sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

Usher misplacement bugs #1525

Status: Closed (closed by Over-There-Is 2 months ago)

Over-There-Is commented 4 months ago
  1. Most JN.1.1 sequences are placed on JN.1> G21535C> C1762A> C11747T> C21535G and assigned as JN.1. Only a few JN.1.1 sequences, those placed on JN.1> C1762A> C25521T> C11747T> C5869T> T25521C, are assigned as JN.1.1.
  2. KP.1.1 is now placed under KP.2.
  3. LA.2 is now placed under LA.1.

@angiehinrichs

Over-There-Is commented 4 months ago

JN.1.1> C9142T> G2035A has been split into several branches

AngieHinrichs commented 4 months ago

Thanks so much @Over-There-Is for reporting these.

  1. Most JN.1.1 sequences are placed on JN.1> G21535C> C1762A> C11747T> C21535G and assigned as JN.1. Only a few JN.1.1 sequences, those placed on JN.1> C1762A> C25521T> C11747T> C5869T> T25521C, are assigned as JN.1.1.

That's really bad! I should be able to fix it in tomorrow's build (2024-04-23).

  2. KP.1.1 is now placed under KP.2.

Unfortunately, my usual trick of pruning and re-optimizing will probably not fix that one. usher and matOptimize use the maximum parsimony method, which scores G22599C > A24819G and A24819G > G22599C identically, even though a person who knows about convergent mutations (and sample dates, countries, etc.) can make a more educated guess.
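
To illustrate why parsimony alone cannot choose between the two orders, here is a minimal sketch using Fitch's small-parsimony rule on a toy tree. The four samples and both tree shapes are hypothetical and made up for illustration, not real sequences, and this is not how usher or matOptimize are implemented; it only shows that the two orderings can tie on mutation count.

```python
def fitch(tree, states):
    """Fitch's rule for one site: return (possible ancestral states, mutation count)."""
    if isinstance(tree, str):                   # leaf: its observed state, no cost
        return {states[tree]}, 0
    l_set, l_cost = fitch(tree[0], states)
    r_set, r_cost = fitch(tree[1], states)
    shared = l_set & r_set
    if shared:                                  # children agree: no mutation needed here
        return shared, l_cost + r_cost
    return l_set | r_set, l_cost + r_cost + 1   # children disagree: count one mutation


def parsimony_score(tree, genotypes, sites):
    """Sum the per-site Fitch mutation counts over all sites."""
    total = 0
    for site in sites:
        # True if the sample carries the mutation at this site, else False
        states = {name: site in muts for name, muts in genotypes.items()}
        total += fitch(tree, states)[1]
    return total


SITES = ["G22599C", "A24819G"]

# Hypothetical samples: which of the two mutations each one carries.
GENOTYPES = {
    "ref": set(),
    "only_22599": {"G22599C"},
    "only_24819": {"A24819G"},
    "both": {"G22599C", "A24819G"},
}

# Tree A: G22599C first, then A24819G inside that clade.
TREE_A = ("ref", ("only_24819", ("only_22599", "both")))
# Tree B: A24819G first, then G22599C inside that clade.
TREE_B = ("ref", ("only_22599", ("only_24819", "both")))

score_a = parsimony_score(TREE_A, GENOTYPES, SITES)
score_b = parsimony_score(TREE_B, GENOTYPES, SITES)
print(score_a, score_b)  # both trees need 3 mutations
```

Each tree has to explain one of the two mutations twice, so the scores tie and the choice between them has to come from outside information such as sample dates or known convergent sites.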

However, having the designated lineages means that at least sequences will have the correct lineage assignment, even if the tree structure is not quite right there.

  3. LA.2 is now placed under LA.1.

Yes, usher and matOptimize have a weird habit of wanting to serialize mutations that share a position (22599C and 22599T in this case, which probably arose independently) -- probably a bug, I will report it. Again, having designated lineages helps to make up for the tree being wrong here.
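
For anyone who wants to scan placement paths for this kind of pattern, here is a rough sketch that flags any genome position mutated more than once along a path. That covers both a serialized pair at one position and a mutation followed by its reversion, as in the paths from item 1. The path format is the `>`-separated one used in this thread; the function name and regex are my own assumptions and not part of usher or matOptimize.

```python
import re
from collections import defaultdict

# A nucleotide substitution such as "G21535C": ref base, 1-based position, new base.
MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def repeated_positions(path):
    """Return {position: [mutations]} for positions hit more than once on a path."""
    by_pos = defaultdict(list)
    for step in path.split(">"):
        step = step.strip()
        match = MUTATION.match(step)
        if match:                       # lineage names like "JN.1" are skipped
            by_pos[int(match.group(2))].append(step)
    return {pos: muts for pos, muts in by_pos.items() if len(muts) > 1}


print(repeated_positions("JN.1> G21535C> C1762A> C11747T> C21535G"))
print(repeated_positions("JN.1.1> C9142T> G2035A"))
```

The first path is flagged at position 21535 (G21535C followed by C21535G), while the second has no repeated position and returns an empty dict.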

JN.1.1> C9142T> G2035A has been split into several branches

I hope that fixing #1 above will fix some of those -- but I see a lot of C9142T out of place now that you mention it, I will work on those too, thanks.

FedeGueli commented 4 months ago

@AngieHinrichs is it OK with you if we keep this issue open, so we can flag any further bugs we find?

AngieHinrichs commented 4 months ago

@AngieHinrichs is it OK with you if we keep this issue open, so we can flag any further bugs we find?

Sure!

FedeGueli commented 2 months ago

JN.1.1> C9142T> G2035A has been split into several branches

Now designated JN.1.1.9 via https://github.com/cov-lineages/pango-designation/commit/377e773e03462f2c51c0d0cb2eeb82da9dcc6a2a