Open matt-sd-watson opened 3 years ago
Hi there, thanks for bringing this up.
The two representations are actually equivalent -- in context, outbreak.info represents this mutation as E156G + del157/158, whereas others represent it as del156/157 + R158G. In both cases the meaning is that all three amino acids are replaced by a single glycine.
In general, when an out-of-frame deletion of length 3*k occurs, it collapses k+1 codons into one, which codes for a new AA. The ambiguity arises because we consider the first affected codon to have mutated and the remaining k codons to have been deleted, whereas others take the opposite approach: the first k codons are deleted and the last one is mutated.
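To make the equivalence concrete, here is a minimal Python sketch of the k = 2 case. The nine-nucleotide window is an illustrative assumption, chosen only so that it translates to E, F, R at positions 156-158; it is not copied from any pipeline, and `translate` and `both_labels` are hypothetical helper names, not part of outbreak.info's code.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate an in-frame nucleotide string codon by codon."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def both_labels(ref_aa: str, first_pos: int, new_aa: str) -> tuple[str, str]:
    """Return the two equivalent labels for k+1 codons collapsing into one."""
    positions = list(range(first_pos, first_pos + len(ref_aa)))
    # Convention A: the first codon is mutated, the rest are deleted.
    mutate_first = (f"{ref_aa[0]}{positions[0]}{new_aa} + del"
                    + "/".join(map(str, positions[1:])))
    # Convention B: the leading codons are deleted, the last one is mutated.
    mutate_last = ("del" + "/".join(map(str, positions[:-1]))
                   + f" + {ref_aa[-1]}{positions[-1]}{new_aa}")
    return mutate_first, mutate_last


# Illustrative window (assumption): three codons coding for E, F, R.
REF_NT = "GAGTTCAGA"
FIRST_POS = 156

# Out-of-frame deletion of 3*k = 6 nucleotides, starting one base into the window.
MUT_NT = REF_NT[:1] + REF_NT[1 + 6:]

ref_aa = translate(REF_NT)   # the three reference residues
new_aa = translate(MUT_NT)   # the single residue left after the deletion

label_a, label_b = both_labels(ref_aa, FIRST_POS, new_aa)
print(label_a)
print(label_b)
```

Both labels come from the same nucleotide-level event; only the choice of which codon counts as "mutated" differs.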
Because there seems to be a consensus among other platforms, we plan to eventually modify our pipeline to follow this convention.
Hi,
Thanks for the clarification! I had assumed the difference came from the inherent ambiguity in how to represent the final amino acid after an out-of-frame deletion in this variant, rather than from a simple website typo, but I wanted to confirm. I will leave it to you to decide whether to close this issue or keep it open until the resource is updated to match the others I posted.
@matt-sd-watson we're going to leave it open for now and eventually shift to the nomenclature that Nextstrain is using to make it less confusing to compare.
Hello, the website reports the coordinates of the double deletion in the Spike protein for Delta (B.1.617.2 sequences) as amino acids 157 and 158: https://outbreak.info/situation-reports?pango=B.1.617.2
However, additional resources report these coordinates as being instead 156 and 157, including:
I am wondering why the coordinates of these deletions in Delta sequences differ between this tool and the other resources I have listed here.