COG-UK / dipi-group

Data integrity and pipeline integration working group

Collection date of B.1.1.7 sequences PHEC-U303UC24 and PHEC-U303UC33 #175

Closed AngieHinrichs closed 2 years ago

AngieHinrichs commented 2 years ago

In cog_metadata.csv, PHEC-U303UC24 has the name "England/PHEC-U303UC24/2020" and a sample collection date of 2020-03-20 -- but it is clearly an Alpha/B.1.1.7 sequence, so March 2020 seems unlikely. FWIW on the UCSC/UShER tree, it is placed on the same node as England/CAMC-13EE58B/2021 (2021-03-14) and England/PHEC-3046E9/2021 (2021-03-29).
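A check like the one described above could be sketched roughly as follows: flag any record whose collection date falls before the earliest plausible date for its assigned lineage. This is only a minimal sketch. The column names `sequence_name`, `sample_date` and `lineage`, the `implausibly_early` helper, and the 2020-09-01 cutoff for B.1.1.7 are assumptions made for illustration, and the lineage of the second sample row is assumed as well.

```python
import csv
import io
from datetime import date

# Assumed cutoff: the earliest known B.1.1.7 samples date from September 2020.
EARLIEST_PLAUSIBLE = {"B.1.1.7": date(2020, 9, 1)}


def implausibly_early(rows):
    """Yield (sequence_name, sample_date) for rows dated before their lineage's cutoff."""
    for row in rows:
        cutoff = EARLIEST_PLAUSIBLE.get(row["lineage"])
        if cutoff is None:
            continue  # no cutoff known for this lineage
        try:
            collected = date.fromisoformat(row["sample_date"])
        except ValueError:
            continue  # skip missing or partial dates
        if collected < cutoff:
            yield row["sequence_name"], row["sample_date"]


# Two rows built from names and dates quoted in this issue.
# Column names are hypothetical; the second row's lineage is assumed.
sample = io.StringIO(
    "sequence_name,sample_date,lineage\n"
    "England/PHEC-U303UC24/2020,2020-03-20,B.1.1.7\n"
    "England/CAMC-13EE58B/2021,2021-03-14,B.1.1.7\n"
)
flagged = list(implausibly_early(csv.DictReader(sample)))
print(flagged)
```

Only the PHEC-U303UC24 row should be reported here, since its date precedes the assumed cutoff.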

In GISAID (EPI_ISL_3776296), it is named "England/PHEC-U303UC24/2021" and has a .1 version with a sample collection date of 2021-08-24 and a .2 version with 2020-03-20. Both versions are assigned B.1.1.7 and seem to have the same AA substitutions listed.

So the current sample collection date (2020-03-20) is consistent between the COG-UK download and GISAID, although the GISAID name still ends in 2021.
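The name/date mismatch on the GISAID side can be illustrated with a small comparison of the year at the end of the name against the year of the collection date. This is a rough sketch, not an existing COG-UK or GISAID tool; `name_year_mismatch` is a hypothetical helper and it assumes names end in `/YYYY` and dates are full ISO dates.

```python
from datetime import date


def name_year_mismatch(name, collection_date):
    """Return True when the trailing year in the name differs from the collection year."""
    suffix = name.rsplit("/", 1)[-1]
    if not suffix.isdigit():
        return False  # name does not end in a year, nothing to compare
    return int(suffix) != date.fromisoformat(collection_date).year


# GISAID name with the .2 collection date
print(name_year_mismatch("England/PHEC-U303UC24/2021", "2020-03-20"))  # True
# GISAID name with the .1 collection date
print(name_year_mismatch("England/PHEC-U303UC24/2021", "2021-08-24"))  # False
# COG-UK name and date
print(name_year_mismatch("England/PHEC-U303UC24/2020", "2020-03-20"))  # False
```

Note that the COG-UK record passes this check because its name and date agree on 2020, so a name-based comparison alone would not catch it.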

PHEC-U303UC33 / EPI_ISL_3776297 has the same thing going on (sample collection date changed from 2021-08-24 to 2020-03-20), except it's most similar to England/PHEC-U303U987/2021 (2021-03-22) and England/PHEC-3032B9/2021 (2021-03-21) plus S:A575S.

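Could the intended date be 2021-03-20 rather than 2020-03-20? That would fit the March 2021 dates of the most similar sequences.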

Thanks!

SamStudio8 commented 2 years ago

Hi @AngieHinrichs, we've fed these back to the originating organisations who will hopefully update the relevant databases.