replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

Report classification as VOI, VOC, ... of lineages in results table #190

Closed oliverdrechsel closed 2 years ago

oliverdrechsel commented 2 years ago

Is your feature request related to a problem? Please describe.

A colleague got a bit lost among the various VOCs, VOIs and VUMs from WHO and ECDC and their respective Pangolin lineage names.

Describe the solution you'd like

It would be cool if poreCov could classify the reported Pangolin lineages, e.g. as "variant of concern". This might require a frequently updated table and would only be a snapshot view, as the VOCs are under constant re-evaluation.

Describe alternatives you've considered

Manual assignment is possible, but could be tedious given the number of VOC/VOI/VUM lineages.

replikation commented 2 years ago

@oliverdrechsel are the WHO labels (the Greek letters) already VOC/VOI by default? These are currently shown, and we could color-code them, e.g. in orange, with a legend explaining that highlighted variants are VOC/VOI. This would of course not distinguish between VOC and VOI. An automatic approach would indeed be preferable.

This is the nextclade definition table

This is the current WHO source for VOCs and VOIs, which they keep updating. We could also link to it from the final report. But I want to avoid writing some strange HTML parser for the WHO page.

oliverdrechsel commented 2 years ago

Writing an HTML parser is certainly something to avoid if we want to stay sane. The WHO page lists VOC, VOI and VUM.

I don't know whether a machine-readable version of this information exists anywhere. I think @hoelzer once worked on something similar.

If the data can't be retrieved from anywhere, maybe this feature request could be dropped?

replikation commented 2 years ago

I'll keep the issue open in case we find a stable source to extract the data from at some point.

hoelzer commented 2 years ago

Good idea, but it's even a bit more complicated because of sublineages. E.g. BA.1 should also be classified as VOC (Omicron), and AY.* is also VOC (Delta).

But I think there are machine-readable, regularly updated resources we could hijack. I will check.
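The sublineage point above can be sketched with simple wildcard matching. This is only a minimal illustration, not poreCov's implementation: the `VARIANT_PATTERNS` table and the `classify_lineage` function are hypothetical names, and the two entries are just the examples mentioned in this comment, so a real table would need regular updates.

```python
from fnmatch import fnmatchcase

# Hypothetical snapshot table: wildcard pattern -> (WHO label, class).
# Only the two examples from the discussion are listed here.
VARIANT_PATTERNS = {
    "BA.*": ("Omicron", "VOC"),
    "AY.*": ("Delta", "VOC"),
}


def classify_lineage(lineage, patterns=VARIANT_PATTERNS):
    """Return (who_label, variant_class) for a Pango lineage, or None."""
    for pattern, info in patterns.items():
        # "*" also matches further dots, so "AY.*" covers AY.4 and AY.4.2.
        if fnmatchcase(lineage, pattern):
            return info
    return None
```

Lineages that match no pattern simply return `None`, which a report could render without any highlighting.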

hoelzer commented 2 years ago

I think we could use this:

https://github.com/3dgiordano/SARS-CoV-2-Variants/blob/main/data/variants.csv

From the Readme:

Naming: The naming system used is a mix between the names denominated by WHO, the Pango Lineage system and some Nextstrain Clade names.

The project use the denomination of:

- WHO/CDC/ECDC/PHE Variants of Concern (VOC)
- WHO/CDC/ECDC Variants of Interest (VOI)
- WHO Alerts for Further Monitoring (AFM)
- CDC Variants Being Monitored (VBM)
- ECDC Variants Under Monitoring (VUM)
- PHE Variants Under Investigation (VUI)

We could use some different color highlighting for
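Reading such a CSV into a lookup table could look roughly like the sketch below. The column names (`lineage`, `who_label`, `status`) and the sample rows are assumptions made up for illustration; the actual layout of `variants.csv` is not shown in this thread and would need to be checked first.

```python
import csv
import io

# Made-up sample in an ASSUMED layout; the real variants.csv may use
# different column names and values.
SAMPLE_CSV = """lineage,who_label,status
BA.1,Omicron,VOC
AY.4,Delta,VOC
B.1,,
"""


def load_variant_table(handle, key_col="lineage", status_col="status"):
    """Map lineage -> classification, skipping rows without a status."""
    table = {}
    for row in csv.DictReader(handle):
        key = (row.get(key_col) or "").strip()
        status = (row.get(status_col) or "").strip()
        if key and status:
            table[key] = status
    return table


# In a workflow, the handle would be the downloaded file instead.
table = load_variant_table(io.StringIO(SAMPLE_CSV))
```

Passing the column names as parameters keeps the sketch usable if the upstream file names its columns differently.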

replikation commented 2 years ago

The question is now mostly how to add it to the HTML. Color-code the Pangolin lineage with two colors, VOC (red) / VOI (orange)? Lineages in neither class would stay black. I want to avoid adding too many columns and overloading the HTML.

hoelzer commented 2 years ago

Yes, agreed. We already have quite a few columns now (also with the insertions, ...), which makes the table quite long already. So colors would be preferable, I would say. Any wishes here, @oliverdrechsel? I like the traffic-light ("Ampel") style, with maybe three colors or red/orange/none as suggested by @replikation.

RaverJay commented 2 years ago

This info could also be added to the Pangolin column in the html (and as separate column in the csv/xlsx output)

replikation commented 2 years ago

@RaverJay what are you using for rendering? HTML usually supports "tabs", like in R Markdown; this way we could keep the important parts in one tab and the additional things in another.

hoelzer commented 2 years ago

Yeah, but just coloring the lineage name in the Pangolin column would also work and does not need additional columns.


RaverJay commented 2 years ago

> @RaverJay what are you using for rendering? HTML usually supports "tabs", like in R Markdown; this way we could keep the important parts in one tab and the additional things in another.

It's just HTML written "by hand". I think a single page has a lot of benefits, though.

> Yeah, but just coloring the lineage name in the Pangolin column would also work and does not need additional columns.

Imho writing VOC also helps a lot instead of only applying color (where people have to consult a legend first to know what is what).

hoelzer commented 2 years ago

> Imho writing VOC also helps a lot instead of only applying color (where people have to consult a legend first to know what is what).

true, +1
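Combining both suggestions (color plus a written label in the Pangolin column) might look like the sketch below. It assumes the report is assembled as plain HTML strings; `CLASS_COLORS` and `render_lineage_cell` are hypothetical names, and the red/orange choice follows the proposal earlier in the thread.

```python
from html import escape

# Colors as proposed in the thread: VOC red, VOI orange, everything else default.
CLASS_COLORS = {"VOC": "red", "VOI": "orange"}


def render_lineage_cell(lineage, variant_class=None):
    """Return a <td> with the lineage, labeled and colored if classified."""
    text = escape(lineage)
    if not variant_class:
        return f"<td>{text}</td>"
    # Write the class next to the lineage so no legend lookup is needed.
    label = f"{text} ({escape(variant_class)})"
    color = CLASS_COLORS.get(variant_class)
    if color is None:
        return f"<td>{label}</td>"
    return f'<td><span style="color: {color}">{label}</span></td>'
```

Classes without an assigned color (e.g. VUM) still get the written label, so the information is not lost when only two colors are used.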