blab / cartography

Dimensionality reduction distills complex evolutionary relationships in seasonal influenza and SARS-CoV-2
https://doi.org/10.1101/2024.02.07.579374
MIT License

Refine the SARS-CoV-2 recombinant dataset to include parental lineages of recombinants #59

Closed huddlej closed 1 year ago

huddlej commented 1 year ago

Adds a new subsampling rule to the workflow for late SARS-CoV-2 data (2022-2023). The rule tries to include 10 samples per Pango lineage for every known recombinant lineage listed on this site, and for each of their parental lineages.
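For reviewers who want the intent of the rule spelled out, here is a minimal Python sketch of "up to 10 samples per lineage" subsampling. It is not the workflow's actual rule: the function name `subsample_per_lineage`, the `pango_lineage` field name, the fixed seed, and the toy lineages `A`, `B`, and `C` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def subsample_per_lineage(records, lineages, max_per_lineage=10, seed=0):
    """Return up to `max_per_lineage` records for each requested lineage.

    `records` is a list of dicts with a (hypothetical) "pango_lineage" key.
    Lineages with fewer records than the cap contribute all of their records.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible

    # Group records by lineage, ignoring lineages we did not ask for.
    by_lineage = defaultdict(list)
    for record in records:
        if record["pango_lineage"] in lineages:
            by_lineage[record["pango_lineage"]].append(record)

    # Sample without replacement within each lineage.
    sampled = []
    for lineage in sorted(by_lineage):
        group = by_lineage[lineage]
        sampled.extend(rng.sample(group, min(max_per_lineage, len(group))))
    return sampled


# Toy metadata: lineage "A" is abundant, "B" is rare, "C" is not requested.
records = (
    [{"strain": f"A-{i}", "pango_lineage": "A"} for i in range(25)]
    + [{"strain": f"B-{i}", "pango_lineage": "B"} for i in range(3)]
    + [{"strain": f"C-{i}", "pango_lineage": "C"} for i in range(12)]
)
sampled = subsample_per_lineage(records, {"A", "B"})
```

The "tries to" in the description corresponds to the `min(...)` call: a lineage with fewer than 10 records is kept in full rather than dropped at this step.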

The site listed 10 recombinant lineages. For 5 of these, we had enough data for the recombinant lineage and both of its parental lineages to inspect the embeddings: XD, XE, XG, XBB, and XBL. We did not have enough data for the other 5: XC, XF, XAY, XBC, and XBF.
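The "enough data" check can be sketched as follows. This is an illustrative approximation only: the PR does not state the threshold, so `min_samples=10` is an assumption, and the function name, lineage labels (`R1`, `P1`, and so on), and counts are hypothetical placeholders rather than values from the dataset.

```python
def recombinants_with_enough_data(counts, parents_by_recombinant, min_samples=10):
    """List recombinants where the recombinant and both parents meet the threshold.

    `counts` maps lineage name -> number of available samples.
    `parents_by_recombinant` maps recombinant name -> tuple of parental lineages.
    """
    usable = []
    for recombinant, parents in parents_by_recombinant.items():
        required = [recombinant, *parents]
        # Every required lineage must have at least `min_samples` records.
        if all(counts.get(lineage, 0) >= min_samples for lineage in required):
            usable.append(recombinant)
    return sorted(usable)


# Hypothetical sample counts and parentage, for illustration only.
counts = {"R1": 40, "R2": 15, "R3": 2, "P1": 100, "P2": 12, "P3": 4, "P4": 50}
parents_by_recombinant = {
    "R1": ("P1", "P2"),  # recombinant and both parents pass
    "R2": ("P3", "P4"),  # parent P3 has too few samples
    "R3": ("P1", "P4"),  # the recombinant itself has too few samples
}
usable = recombinants_with_enough_data(counts, parents_by_recombinant)
```

Under these toy inputs only `R1` is retained, which mirrors how a recombinant is excluded when either it or one of its parents is undersampled.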

This PR updates the figures and Auspice JSON for the late SC2 workflow to reflect these newly included lineages.

Closes #57