sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

HV.2+S:D1153A (33 seqs, 6 countries) #1270

Closed: aviczhl2 closed this 5 months ago

aviczhl2 commented 8 months ago

HV.2+A25020C(S:D1153A)

GISAID query: A25020C, G14559A, C5835T

No. of seqs: 18 (Australia 4, Denmark 2, UK 7, France 1, South Africa 1, plus 3 OY seqs of unknown country); 15 on GISAID, 3 currently not
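A minimal sketch of how a query like the one above could be applied to a local list of sequences, assuming it is read as "the sequence carries all three nucleotide substitutions". The sample ids and their mutation lists are hypothetical placeholders, not real GISAID records, and `matches_query` is an illustrative helper, not part of any GISAID or UShER tooling.

```python
# Query mutations taken from the comment above.
QUERY = {"A25020C", "G14559A", "C5835T"}


def matches_query(substitutions, query=QUERY):
    """Return True when the sequence carries every mutation in the query."""
    return query <= set(substitutions)


# Hypothetical records: sequence id -> nucleotide substitutions.
samples = {
    "seq_a": ["C241T", "C5835T", "G14559A", "A25020C"],
    "seq_b": ["C241T", "C5835T", "G14559A"],  # lacks A25020C
}

hits = sorted(name for name, subs in samples.items() if matches_query(subs))
print(hits)
```

Using a set-subset test keeps the check independent of the order in which mutations are listed for each sequence.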

@AngieHinrichs @yatisht, it seems that the OY seqs may belong to multiple countries, including the UK, Denmark, the USA and maybe others, but they are not labeled on UShER. Is that easy to fix?

First: EPI_ISL_18577680, France, 2023-11-20

Latest: CLIMB-CM7YJ4MJ, UK, 2023-12-29



FedeGueli commented 8 months ago

I messed up with HV.1.2; now fixed. The interesting thing about this one is that it started spreading while JN.1 was already well on the rise, and it popped up in multiple countries within a few weeks. This may point to an undersampled "reservoir" somewhere, or to something spreading really fast.

AngieHinrichs commented 8 months ago

> @AngieHinrichs @yatisht, it seems that the OY seqs may belong to multiple countries, including the UK, Denmark, the USA and maybe others, but they are not labeled on UShER. Is that easy to fix?

Hopefully it will resolve itself within a few days. In general, when an INSDC (GenBank/ENA/DDBJ) sequence has no name, it is a problem of synchronization between international databases. I download INSDC sequences using NCBI Datasets, which uses data aggregated by NCBI Virus, which draws from NCBI GenBank and the NCBI version of BioSample. European submitters tend to submit their sequences and metadata to the European Nucleotide Archive (ENA). For some reason the metadata (including names) sometimes lags the sequence data when NCBI imports from ENA, and/or when data are further packaged by NCBI Virus and NCBI Datasets. (I used to think of it all as just "GenBank" but have learned that there is a lot going on behind the scenes!)

If it doesn't get better within another day or two, let me know and I can ask NCBI why the metadata for those sequences aren't yet included in the download files. (I think there are sometimes undetected job failures and they can re-run a batch.)
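As a rough illustration of how one might spot the affected records while waiting for the databases to sync, here is a minimal sketch that lists accessions whose name or country is still empty in a tab-separated metadata table. The column names (`accession`, `name`, `country`) and the example rows are assumptions made for this sketch; they are not the actual layout of the NCBI Datasets or NCBI Virus download files.

```python
import csv
import io


def unlabeled(tsv_text, required=("name", "country")):
    """Return accessions whose required metadata fields are empty or missing."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["accession"]
        for row in reader
        if any(not (row.get(col) or "").strip() for col in required)
    ]


# Hypothetical metadata table; HYPO_2 has sequence data but no name or country yet.
example = (
    "accession\tname\tcountry\n"
    "HYPO_1\tsample-1\tUK\n"
    "HYPO_2\t\t\n"
)

print(unlabeled(example))
```

Re-running the same check on a later download would show whether the metadata has caught up.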

FedeGueli commented 6 months ago

Let's see how it goes in Oceania in the next batches; likely slow.

aviczhl2 commented 6 months ago

+1 Greece

aviczhl2 commented 5 months ago

No new seqs