cov-lineages / constellations

BA.1-like vs BA.2 #46

Closed hsnguyen closed 2 years ago

hsnguyen commented 2 years ago

Hi team, we have 2 sequences from the same sample (SNP distance = 0), hCoV-19/Australia/QLD2584/2021 and hCoV-19/Australia/QLD2568/2021, but they were assigned to different sub-lineages of Omicron using the newest constellations.

| taxon | lineage | conflict | ambiguity_score | scorpio_call | scorpio_support | scorpio_conflict | version | pangolin_version | pangoLEARN_version | pango_version | status | note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hCoV-19/Australia/QLD2584/2021 | BA.1 | 0.0 | 0.9397562119081107 | Probable Omicron (BA.1-like) | 0.517200 | 0.258600 | PLEARN-v1.2.101 | 3.1.17 | 2021-11-25 | v1.2.101 | passed_qc | scorpio call: Alt alleles 30; Ref alleles 15; Amb alleles 10; Oth alleles 3; scorpio replaced lineage assignment AZ.2 |
| hCoV-19/Australia/QLD2568/2021 | BA.2 | 0.0 | 1.0 | BA.2-like | 0.984800 | 0.015200 | PLEARN-v1.2.101 | 3.1.17 | 2021-11-25 | v1.2.101 | passed_qc | scorpio call: Alt alleles 65; Ref alleles 1; Amb alleles 0; Oth alleles 0; scorpio replaced lineage assignment AZ.2 |
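The scorpio_support and scorpio_conflict values above appear to be the alt and ref allele counts divided by the total number of sites listed in the note column. The sketch below reproduces the reported numbers under that assumption; the formula is inferred from the figures in this thread, not taken from the scorpio source, and `scorpio_scores` is a hypothetical helper name.

```python
def scorpio_scores(alt, ref, amb, oth):
    """Return (support, conflict) as fractions of all counted sites.

    Assumption: support = alt / total and conflict = ref / total,
    rounded to 4 decimal places. Inferred from the reported values only.
    """
    total = alt + ref + amb + oth
    return round(alt / total, 4), round(ref / total, 4)


# Counts copied from the "note" column of the pangolin output above.
qld2584 = scorpio_scores(alt=30, ref=15, amb=10, oth=3)
qld2568 = scorpio_scores(alt=65, ref=1, amb=0, oth=0)

print(qld2584)  # (0.5172, 0.2586)
print(qld2568)  # (0.9848, 0.0152)
```

The two rows sum to different totals (58 and 66 sites), presumably because each row reports counts against the constellation that was actually called.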

The only difference between the 2 sequences is that QLD2568 has better quality than QLD2584. If I run scorpio haplotype against the BA.2 constellation, QLD2568 is a perfect match, while QLD2584 has only 1 ambiguous allele there.

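| query | ref_count | alt_count | ambig_count | other_count | support | conflict | orf1ab:S135R | orf1ab:T842I | orf1ab:G1307S | nuc:C4321T | orf1ab:L3027F | nuc:A9424G | orf1ab:T3090I | orf1ab:L3201F | nuc:C10198T | nuc:G10447A | nuc:C12880T | nuc:C15714T | nuc:C15714T | orf1ab:R5716C | orf1ab:T6564I | nuc:A20055G | spike:T19I | del:21633:9 | nuc:T22200G | spike:S371F | spike:T376A | spike:D405N | spike:R408S | nuc:C26060T | nuc:C26858T | orf6:D61L | n:S413R |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hCoV-19/Australia/QLD2584/2021 | 0 | 26 | 1 | 0 | 0.963000 | 0.000000 | R | I | S | T | F | G | I | F | N | A | T | T | T | C | I | G | I | 3 | G | F | A | N | S | T | T | L | R |
| hCoV-19/Australia/QLD2568/2021 | 0 | 27 | 0 | 0 | 1.000000 | 0.000000 | R | I | S | T | F | G | I | F | T | A | T | T | T | C | I | G | I | 3 | G | F | A | N | S | T | T | L | R |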
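To see which site accounts for the single ambiguous call, the two haplotype rows above can be compared column by column. This is a minimal sketch that copies the site names and allele calls from the scorpio haplotype output verbatim (including the repeated nuc:C15714T column); the variable names are illustrative only.

```python
# Site names and allele calls copied from the scorpio haplotype output above.
sites = (
    "orf1ab:S135R orf1ab:T842I orf1ab:G1307S nuc:C4321T orf1ab:L3027F "
    "nuc:A9424G orf1ab:T3090I orf1ab:L3201F nuc:C10198T nuc:G10447A "
    "nuc:C12880T nuc:C15714T nuc:C15714T orf1ab:R5716C orf1ab:T6564I "
    "nuc:A20055G spike:T19I del:21633:9 nuc:T22200G spike:S371F "
    "spike:T376A spike:D405N spike:R408S nuc:C26060T nuc:C26858T "
    "orf6:D61L n:S413R"
).split()
qld2584 = "R I S T F G I F N A T T T C I G I 3 G F A N S T T L R".split()
qld2568 = "R I S T F G I F T A T T T C I G I 3 G F A N S T T L R".split()

# Every site must have exactly one call per sequence.
assert len(sites) == len(qld2584) == len(qld2568)

# Keep only the sites where the two sequences disagree.
diffs = [(s, a, b) for s, a, b in zip(sites, qld2584, qld2568) if a != b]
print(diffs)  # [('nuc:C10198T', 'N', 'T')]
```

The only disagreement is at nuc:C10198T, where QLD2584 has an N and QLD2568 has the alt allele T, which matches the ambig_count of 1 versus 0.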

Please find the FASTA file attached for your convenience (QLD2584 has been removed from GISAID due to duplication): QLD2568-QLD2584.zip

We are calling it BA.2 but just wanted to let you know about the issue. Thanks,

rmcolq commented 2 years ago

Looks like the threshold for the number of ambiguities allowed was too strict to let this sequence through. I agree this is undesirable behaviour and will update the constellation to be more flexible.
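As a rough illustration of how an ambiguity cap can reject an otherwise clean match, the sketch below applies a simple count-based rule to the QLD2584 BA.2 haplotype counts reported earlier in the thread. Everything about the rule is an assumption made for illustration: the `passes` function, the parameter names `min_alt`, `max_ref` and `max_amb`, and the threshold values are hypothetical and are not taken from the actual constellation definition.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    alt: int
    ref: int
    amb: int
    oth: int


def passes(counts, min_alt, max_ref, max_amb):
    """Hypothetical rule: enough alt calls, few ref calls, few ambiguous calls."""
    return (
        counts.alt >= min_alt
        and counts.ref <= max_ref
        and counts.amb <= max_amb
    )


# Counts for QLD2584 against the BA.2 constellation, from the haplotype output.
qld2584_ba2 = Counts(alt=26, ref=0, amb=1, oth=0)

# Threshold values below are invented for illustration only.
strict = passes(qld2584_ba2, min_alt=20, max_ref=2, max_amb=0)
relaxed = passes(qld2584_ba2, min_alt=20, max_ref=2, max_amb=3)

print(strict)   # False
print(relaxed)  # True
```

Under these made-up thresholds, one ambiguous site is enough to fail the strict rule even though 26 of 27 sites carry the alt allele, and loosening only the ambiguity cap lets the sequence through.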