GenSpectrum / cov-spectrum-website

A web platform to detect and analyze variants of SARS-CoV-2
https://cov-spectrum.org
GNU General Public License v3.0

ENH: Option to use covSpectrum in `submission date mode` by simple extension to LAPIS #648

Open corneliusroemer opened 1 year ago

corneliusroemer commented 1 year ago

I know you're working on the submission date filters.

This could be a more broadly useful feature.

Here's an example: because some countries submit much faster than others, worldwide growth advantages are heavily biased towards variants circulating in countries that are fast at submitting. Take BQ.1.1.20 in Denmark: it isn't the fastest thing around worldwide; it's so high up simply because Denmark has large numbers of samples and a quick turnaround:

image

Of course I could filter to a country, but here's a better way: allow growth advantages to be calculated in submission date mode. This means simply using submission date in place of collection date. This way, growth advantage becomes roughly unbiased by things like submission delay, as long as submission delays are constant across time (which they approximately are).
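To illustrate the bias, here is a toy sketch in Python. All numbers are invented for illustration (two hypothetical countries, constant delays of 7 and 28 days, 100 sequences per country per day), and the variant's true share is flat, so any apparent growth is an artifact of the date type used.

```python
TODAY = 60                                   # day on which the database is queried
DELAY = {"fast": 7, "slow": 28}              # assumed constant submission delay (days)
PER_DAY = 100                                # assumed sequences collected per country per day
VARIANT_SHARE = {"fast": 0.5, "slow": 0.0}   # flat over time: no real growth


def variant_share_by_day(use_submission_date: bool) -> dict[int, float]:
    """Worldwide variant share per day, binned by collection or submission date."""
    total: dict[int, float] = {}
    variant: dict[int, float] = {}
    for country, delay in DELAY.items():
        # Start collection before day 0 so early submission days are fully populated.
        for collected in range(-max(DELAY.values()), TODAY + 1):
            submitted = collected + delay
            if submitted > TODAY:
                continue  # not yet in the database
            day = submitted if use_submission_date else collected
            if day < 0:
                continue  # outside the window we report
            total[day] = total.get(day, 0) + PER_DAY
            variant[day] = variant.get(day, 0) + PER_DAY * VARIANT_SHARE[country]
    return {d: variant[d] / total[d] for d in sorted(total)}


by_collection = variant_share_by_day(use_submission_date=False)
by_submission = variant_share_by_day(use_submission_date=True)
```

Binned by collection date, the share jumps from 25% to 50% for recent days, because only the fast country's recent samples are in the database yet. Binned by submission date, it stays at 25% throughout.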

This may actually be easier to implement than a custom submission date filter: one would just swap out the date type being used, and no other code would need to change. If a flag "submission_date=True" is passed to LAPIS, LAPIS simply returns the submission date in place of the collection date. Done!
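As a rough sketch of the idea in Python (not LAPIS code): the record layout and the field names `date` and `date_submitted` are assumptions made for illustration, and `submission_date` mirrors the flag proposed above.

```python
from datetime import date


def effective_date(record: dict, submission_date: bool = False) -> date:
    """Return the date that downstream code should treat as the sample date.

    Hypothetical field names: 'date' (collection) and 'date_submitted'.
    """
    key = "date_submitted" if submission_date else "date"
    return record[key]


sample = {"date": date(2022, 12, 1), "date_submitted": date(2022, 12, 9)}
collection_mode = effective_date(sample)
submission_mode = effective_date(sample, submission_date=True)
```

Everything that consumes the returned date (growth advantage estimation, charts) would stay unchanged, which is the point of the proposal.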

Of course this would not interact at all with collection date - so this is not the best solution for all cases, but this could very well help broaden the usefulness.

Right now, @FedeGueli and I calculate these metrics manually from GISAID which is silly ;)

https://docs.google.com/spreadsheets/d/1sMCQyPfMG-pqd8Z0aoV6aJRHCc4vXusGl7pEJ68j10w/edit#gid=0

corneliusroemer commented 1 year ago

Here is BQ.1.1.20 in Denmark. As you can see, the growth advantage is much more realistic:

image

I'd expect submission-date growth advantages to be generally less biased by submission speed, so this could be very useful.

FedeGueli commented 1 year ago

Agreed, even if it could seem counterintuitive at first glance. Thanks to Cornelius's intuition, we have been testing it for a couple of months and it is very sensitive: we were able to understand earlier that the growth of some BF.7 sublineages wasn't so high, and to catch the BA.2.75 comeback very early.

I further suggest adding to the collection charts an "8 doubling time" column: the number of days, based on submission date, taken to grow eightfold (from 25 to 200 sequences, from 50 to 400, and so on), showing the most recent one.
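A minimal Python sketch of how such a column could be computed. It assumes cumulative sequence counts by submission date and thresholds that double each step (25, 50, 100, ...); both are interpretations of the suggestion above, and the function name and example counts are made up.

```python
from datetime import date
from typing import Optional


def eightfold_times(cumulative: list[tuple[date, int]], start: int = 25) -> list[tuple[int, int, int]]:
    """Return (low, high, days) for low = start, 2*start, ... with high = 8 * low.

    `cumulative` is a date-sorted list of (submission date, cumulative count).
    Stops at the first window whose upper threshold has not been reached yet.
    """
    def first_day_reaching(threshold: int) -> Optional[date]:
        for day, count in cumulative:
            if count >= threshold:
                return day
        return None

    results = []
    low = start
    while True:
        day_low = first_day_reaching(low)
        day_high = first_day_reaching(8 * low)
        if day_low is None or day_high is None:
            break
        results.append((low, 8 * low, (day_high - day_low).days))
        low *= 2
    return results


# Made-up cumulative counts by submission date
series = [
    (date(2023, 1, 1), 25),
    (date(2023, 1, 11), 50),
    (date(2023, 1, 21), 200),
    (date(2023, 1, 31), 400),
]
windows = eightfold_times(series)
latest = windows[-1]  # the value the column would display
```

Here `latest` is the 50 to 400 window, the last one for which both thresholds have been reached.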