nextstrain / forecasts-ncov

SARS-CoV-2 variant growth rates and frequency forecasts
https://nextstrain.org/sars-cov-2/forecasts/

Replace "recombinant" clade with label "other" #45

Status: Closed (trvrb closed 1 year ago)

trvrb commented 1 year ago

Description of proposed changes

With the rise of XBB viruses there are a large number of "recombinant" clades that no longer mean the same thing they used to. I think it's clearer and less noisy to just always collapse "recombinant" into the "other" category.

@joverlee521: I wasn't sure if you thought this needed more of a warning to users. I was basically trying to mirror how NA is replaced with other.
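A minimal sketch of the collapse described above, for illustration only. The names `COLLAPSE_TO_OTHER`, `collapse_clade`, and `count_clades` are hypothetical and not taken from this repository, and the record layout (location, date, clade) and the way missing values are represented are assumptions.

```python
from collections import Counter

# Assumption: labels that should be folded into "other".
# Missing values are represented here as None, "" or "NA"; the real
# script may represent them differently.
COLLAPSE_TO_OTHER = {"recombinant", "NA", "", None}


def collapse_clade(clade):
    """Return "other" for missing or recombinant labels, else the label unchanged."""
    return "other" if clade in COLLAPSE_TO_OTHER else clade


def count_clades(records):
    """Count sequences per (location, date, clade) after collapsing labels.

    `records` is an iterable of (location, date, clade) tuples.
    """
    counts = Counter()
    for location, date, clade in records:
        counts[(location, date, collapse_clade(clade))] += 1
    return counts
```

Treating "recombinant" the same way as a missing label keeps the relabelling in one place, so both cases end up in a single "other" count per location and date.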

Testing

Testing locally gives expected results:

[Image: dropping-recombinant]

Though it looks like I need to fix the tests. I'll look into this.

joverlee521 commented 1 year ago

> With the rise of XBB viruses there are a large number of "recombinant" clades that no longer mean the same thing they used to. I think it's clearer and less noisy to just always collapse "recombinant" into the "other" category.

Is there any reason we would want to support analysis of old dates with "recombinant" as an output variant? If so, we could use --force-include-clades recombinant=other to force the collapse without hard-coding it into the script.
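To illustrate the difference from hard-coding, here is a rough sketch of how an option of this shape could drive the relabelling. It assumes the option accepts one or more OLD=NEW pairs; the parsing below and the helper `parse_clade_mapping` are hypothetical and do not reproduce the script's actual argument handling.

```python
import argparse


def parse_clade_mapping(pairs):
    """Turn ["old=new", ...] into {"old": "new", ...}, rejecting malformed pairs."""
    mapping = {}
    for pair in pairs:
        old, sep, new = pair.partition("=")
        if not sep or not old or not new:
            raise ValueError(f"expected OLD=NEW, got {pair!r}")
        mapping[old] = new
    return mapping


# Assumption: the option takes zero or more OLD=NEW pairs.
parser = argparse.ArgumentParser()
parser.add_argument("--force-include-clades", nargs="*", default=[], metavar="OLD=NEW")

# Parse an explicit argument list so the example does not depend on sys.argv.
args = parser.parse_args(["--force-include-clades", "recombinant=other"])
mapping = parse_clade_mapping(args.force_include_clades)

# Labels without an entry in the mapping are left unchanged.
clades = ["23A", "recombinant", "22F"]
relabelled = [mapping.get(clade, clade) for clade in clades]
```

With this approach, a run over older dates could omit the pair and keep "recombinant" as its own output variant.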

joverlee521 commented 1 year ago

Ran the model with the clade counts produced by the test run and uploaded the results to s3://nextstrain-data/files/workflows/forecasts-ncov/trial/collapse-recombinants/gisaid/nextstrain_clades/global/mlr/2023-10-31_results.json

Comparing the current results from 2023-10-30 with the trial results:

[Screenshots: four images taken 2023-10-31 comparing the current and trial results]