sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

Upward trend of Orf8:S67F in XBB* #251

Closed xz-keg closed 1 year ago

xz-keg commented 1 year ago

Orf8:S67F is a common mutation. Many XBB* lineages have a branch carrying it.

I just noticed a stable upward trend in the proportion of Orf8:S67F within XBB. The count is the sum of sequences with Orf8:S67F across all XBB branches, so it is not a co-occurrence effect, and since the trend appears everywhere in the world it is not an artifact of asymmetric submission delays.

This is quite difficult to understand given the G8* truncation.

Screenshot 2023-06-26 at 01 09 42
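
For anyone who wants to reproduce this kind of trend, here is a minimal sketch of the calculation, not the query actually used for the screenshot. It assumes each sequence is available as a (collection date, set of mutations) pair; that record layout and the function name `weekly_mutation_share` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date


def weekly_mutation_share(records, mutation="ORF8:S67F"):
    """Return {(iso_year, iso_week): share of sequences carrying `mutation`}.

    `records` is an iterable of (collection_date, mutations) pairs, where
    `mutations` is a set of strings. This layout is an assumption made for
    the sketch, not the format of any particular database export.
    """
    totals = defaultdict(int)  # all sequences per ISO week
    hits = defaultdict(int)    # sequences carrying the mutation per ISO week
    for collected, mutations in records:
        iso = collected.isocalendar()
        week = (iso[0], iso[1])
        totals[week] += 1
        if mutation in mutations:
            hits[week] += 1
    # Weeks are returned in chronological order.
    return {week: hits[week] / totals[week] for week in sorted(totals)}


# Made-up toy records, for illustration only (not real surveillance data).
toy_records = [
    (date(2023, 6, 5), {"ORF8:S67F"}),
    (date(2023, 6, 6), set()),
    (date(2023, 6, 12), {"ORF8:S67F"}),
    (date(2023, 6, 13), {"ORF8:S67F", "ORF8:G8*"}),
]
shares = weekly_mutation_share(toy_records)
```

Running the same function separately on each XBB sub-branch, or on each country, would be one way to check that the rise is not driven by a single lineage or by submission delays in one region.
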
FedeGueli commented 1 year ago

I think @ryhisner has studied this mutation more deeply than the rest of us; maybe he wants to share his findings here.

FedeGueli commented 1 year ago

cc @ryhisner do you want to share your findings on this mutation?

ryhisner commented 1 year ago

I don't think this mutation has any fitness benefits, which seems more obvious than ever now that it occurs behind a stop codon. Instead, I think it is in an extraordinarily favorable position for a C->T mutation to occur through APOBEC3 deamination.

APOBEC3 is a host protein that causes C->T mutations through deamination of cytosine. It's known that some nucleotide contexts are much more favorable than others for this mutation. ORF8:S67F has a perfect nucleotide context since it's surrounded by T's or A's. Furthermore, in the secondary RNA structure, nucleotides that are unpaired and at the end of stem-loops are far more likely to be subject to C->T deamination. C28093 is unpaired and located at the end of a long stem-loop in the secondary RNA structure according to the only map of secondary RNA structure I've ever located.

image image

Here's a figure from another study that graphed all 16 possible nucleotide contexts for each type of transition mutation. C->T mutation contexts are the light blue bars. You can see that a context of TCT is extremely favorable.

image
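
To make the "nucleotide context" idea concrete, here is a small sketch that pulls out the three-base context around a 1-based genome position and checks whether it is a C flanked by A or T on both sides. The function names are hypothetical, and the toy string below is made up for illustration; it is not the SARS-CoV-2 reference sequence.

```python
def trinucleotide_context(genome, pos):
    """Return the 3-base context centred on the 1-based position `pos`."""
    if pos < 2 or pos > len(genome) - 1:
        raise ValueError("position needs a neighbour on both sides")
    # 1-based pos maps to 0-based index pos - 1; take one base either side.
    return genome[pos - 2 : pos + 1].upper()


def is_at_flanked_c(context):
    """True if the context is a C with A or T (U in RNA) on both sides."""
    return (
        len(context) == 3
        and context[1] == "C"
        and context[0] in ("A", "T", "U")
        and context[2] in ("A", "T", "U")
    )


# Made-up toy sequence, NOT the real reference genome.
toy_genome = "GGTCTAA"
context = trinucleotide_context(toy_genome, 4)
favorable = is_at_flanked_c(context)
```

With a real reference genome loaded as a string, calling `trinucleotide_context(reference, 28093)` would return the context being discussed above. This sketch only covers the sequence context; whether the base is paired in the secondary RNA structure would need a separate structure map.
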