hodcroftlab / covariants

Real-time updates and information about key SARS-CoV-2 variants, plus the scripts that generate this information.
https://covariants.org/
GNU Affero General Public License v3.0

Programmatic analysis of variants #143

Open maddyboo opened 3 years ago

maddyboo commented 3 years ago

Hi CoVariants team!

As part of the Cambridge Festival, @bethsampher & @JamesABaker (both at the Wellcome Genome Campus) ran a hackathon to spark ideas around making use of the large amount of sequencing data coming out of current Covid-19 projects.

One product of that was my (potentially) dominant variant finder. My thinking was: with enough localised data, could particularly dominant variants be spotted programmatically?

Currently it just uses an old snapshot of the CoVariants country data file. However, if it would be of some use to you or to others outside the project, I'd be happy to expand it to run routinely on the entire dataset.
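To make the idea concrete, here is a minimal sketch of flagging a dominant variant per time period. It is not the hackathon code. The input shape (period, then variant, then sequence count), the function name `dominant_variants`, the 50% threshold and the example counts are all assumptions made for illustration; the real CoVariants country data file may be structured differently.

```python
from typing import Dict, Optional

# Hypothetical input shape: {period: {variant_name: sequence_count}}.
Counts = Dict[str, Dict[str, int]]


def dominant_variants(counts: Counts, threshold: float = 0.5) -> Dict[str, Optional[str]]:
    """For each period, return the variant whose share of sequences
    exceeds `threshold`, or None if no variant does."""
    result: Dict[str, Optional[str]] = {}
    for period, by_variant in counts.items():
        total = sum(by_variant.values())
        if total == 0:
            # No sequences in this period, so nothing can be dominant.
            result[period] = None
            continue
        # Pick the variant with the most sequences in this period.
        top, top_count = max(by_variant.items(), key=lambda kv: kv[1])
        result[period] = top if top_count / total > threshold else None
    return result


# Made-up counts for one location, purely illustrative.
example = {
    "2021-01": {"variant_a": 80, "variant_b": 20},
    "2021-02": {"variant_a": 40, "variant_b": 45, "variant_c": 15},
}
print(dominant_variants(example))
# {'2021-01': 'variant_a', '2021-02': None}
```

A fixed share threshold is the simplest possible rule. It ignores sampling bias and small sample sizes, so anything it flags would only be a prompt for a closer look.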

I'm a scientist, but can claim no expertise in this area (I'm a chemist...), so I'm unsure of its utility. If the project is of some use, we'd be very grateful for any feedback you could provide! More discussion can be found in the issue here.

Thanks for your time,

Maddyboo

anjali-gopinathan commented 3 years ago

The motivating arguments, and an algorithm that could address this problem, are below.

  1. Alpha started in the UK and successfully outcompeted most other variants to become the dominant strain worldwide
  2. Alpha itself was outcompeted by Delta, which has now become (or is rapidly on the way to becoming) the dominant strain worldwide
  3. Anything that can outcompete Delta is worth looking at carefully and soon
  4. Something is pushing back Delta in India – maybe AY.1, maybe AY.2, maybe something else. But whatever it is, we should understand what is outcompeting Delta.
  5. This concept is related to: https://github.com/wgc-hackathon/covid/issues/14 and https://github.com/hodcroftlab/covariants/issues/197 – namely that the site should be detecting and presenting significant risks, which are indicated by variants that outcompete (push back) strong variants.
  6. A simple algorithm that directly addresses the above and the question in issues 14 and 197 might be this: