PoonLab / gromstole

Quantifying SARS-CoV-2 VoCs from NGS data of wastewater samples
MIT License

Need to update mutation lists for several other lineages #75

Closed ArtPoon closed 1 year ago

ArtPoon commented 1 year ago

maybe we should just start running Freyja instead of having to chase sub-lineages all the time

ArtPoon commented 1 year ago

@Abayomi-Olabode @GopiGugan - I've written an R script (unique-mutations.R) that will process the JSON file generated by our Untitled project to:

  1. determine the frequencies of mutations in genomes that belong to some focus lineage X
  2. generate a barplot of frequencies for those mutations in all other (non-focal) lineages
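The two steps above can be sketched roughly as follows. This is not the contents of unique-mutations.R: the input structure, the lineage and mutation names, and the `min.freq` cutoff are all made up for illustration, and the real JSON layout may differ.

```r
# Hypothetical input: for each lineage, the number of genomes (n) and the
# number of genomes carrying each mutation (counts). All values are invented.
lineages <- list(
  X = list(n = 100, counts = c(mut1 = 98, mut2 = 95, mut3 = 4)),
  A = list(n = 50,  counts = c(mut1 = 50, mut3 = 1)),
  B = list(n = 200, counts = c(mut2 = 2))
)
focal <- "X"
min.freq <- 0.9  # assumed cutoff for keeping a focal-lineage mutation

# Step 1: mutation frequencies within the focal lineage
focal.freq <- lineages[[focal]]$counts / lineages[[focal]]$n
muts <- names(focal.freq)[focal.freq >= min.freq]

# Step 2: frequencies of those mutations in every other lineage
# (rows = lineages, columns = mutations; a mutation never seen counts as 0)
other.names <- setdiff(names(lineages), focal)
others <- do.call(rbind, sapply(other.names, function(ln) {
  x <- lineages[[ln]]
  f <- x$counts[muts] / x$n
  f[is.na(f)] <- 0
  setNames(f, muts)
}, simplify = FALSE))

# Mean frequency of each mutation across the non-focal lineages
mean.freq <- colMeans(others)
```

Passing `mean.freq` to `barplot()` would then give a plot of the kind described in step 2.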

Example for BA.5.2.1:

[Image: Screen Shot 2022-11-27 at 10 47 56 PM]

ArtPoon commented 1 year ago

I'll copy my current JSON file to /data/wastewater so you don't have to generate it yourself (but you just have to run Untitled retrieve.py to do so)

ArtPoon commented 1 year ago

Note that the above plot displays the mean mutation frequencies, averaged across the other lineages. If only a handful of lineages are fixed for a given mutation, it may still be shown at a low frequency.
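A small made-up example of this effect, assuming a matrix with one row per non-focal lineage and one column per mutation (the lineage names, mutation names and frequencies are invented):

```r
# Toy frequencies: rows = non-focal lineages, columns = mutations
others <- rbind(
  A = c(mut1 = 1.00, mut2 = 0.02),
  B = c(mut1 = 0.00, mut2 = 0.01),
  C = c(mut1 = 0.00, mut2 = 0.03),
  D = c(mut1 = 0.00, mut2 = 0.02)
)

# Averaging across lineages dilutes mut1, although it is fixed in lineage A
mean.freq <- colMeans(others)

# The per-mutation maximum shows that at least one lineage is fixed for mut1
top.freq <- apply(others, 2, max)

print(rbind(mean = mean.freq, max = top.freq))
```

Here mut1 has a mean of 0.25 but a maximum of 1, so a plot of means alone would understate how strongly it is shared with lineage A.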

ArtPoon commented 1 year ago

This might be more useful:

# for each mutation (column of others), the name of the lineage (row) with the highest frequency
max.freq <- apply(others, 2, function(x) row.names(others)[which.max(x)])

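A quick way to try that expression out, assuming `others` is a matrix with lineages as rows and mutations as columns (the toy values are invented). It also pairs each lineage name with the frequency it reached, which is an addition for illustration rather than part of the original script:

```r
# Toy frequencies: rows = non-focal lineages, columns = mutations
others <- rbind(
  A = c(mut1 = 0.00, mut2 = 0.01),
  B = c(mut1 = 1.00, mut2 = 0.03),
  C = c(mut1 = 1.00, mut2 = 0.02)
)

# Name of the lineage with the highest frequency, per mutation
max.freq <- apply(others, 2, function(x) row.names(others)[which.max(x)])

# Pair the lineage name with the frequency it reached
top <- data.frame(lineage = max.freq, freq = apply(others, 2, max))
print(top)
```

`which.max()` returns only the first maximum, so when several lineages tie (B and C for mut1 here) only the first one in row order is reported.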
Abayomi-Olabode commented 1 year ago

Thanks