Open rosekantor opened 2 years ago
Hi Rose,
A table of the data shown in the second bar from the left on the locations page (says "AA SNVs" or "NT SNVs" on the top, depending on the applied filters) - see screenshot below.
To get the data for the legend, you can select "Download Aggregate Data" (see picture below).
This results in a CSV file in which each unique combination of mutations is aggregated into one row. The mutations are in the form pos|ref|alt and are delimited by semicolons. To get the frequencies of single mutations, you'll have to pull apart that mutation string and count each mutation as you go through the rows.
I understand this is a bit of work, so I'll make a download item that splits this up and collapses by date like you requested.
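In the meantime, the splitting and counting described above could be sketched roughly as follows. This is only a minimal sketch: the column names `mutations` and `count` are assumptions for illustration and may not match the headers in the actual "Download Aggregate Data" CSV, so pass the real names via the function arguments. The sample rows are made up as well.

```python
import csv
import io
from collections import Counter


def count_single_mutations(csv_file, mutation_col="mutations", count_col="count"):
    """Sum per-mutation counts from an aggregate CSV.

    Each row holds a semicolon-delimited string of pos|ref|alt mutations
    plus the number of sequences sharing that combination.
    The column names are hypothetical defaults, not the site's real headers.
    """
    totals = Counter()
    for row in csv.DictReader(csv_file):
        n_sequences = int(row[count_col])
        # Pull apart the mutation string and credit each mutation
        # with the number of sequences in this row.
        for mutation in row[mutation_col].split(";"):
            mutation = mutation.strip()
            if mutation:  # skip empty strings (e.g. rows with no mutations)
                totals[mutation] += n_sequences
    return totals


# Made-up sample data in the assumed layout.
sample = io.StringIO(
    "mutations,count\n"
    "23403|A|G;241|C|T,10\n"
    "23403|A|G,5\n"
    "241|C|T;3037|C|T,2\n"
)
totals = count_single_mutations(sample)
for mutation, n in totals.most_common():
    print(mutation, n)
```

To use it on a downloaded file, open the file with `open(path, newline="")` and pass the file object in place of `sample`. Collapsing by date would need an extra grouping key taken from whatever date column the file provides.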
A filtered version of the full lineage table obtained by clicking "download" > "consensus mutation". Here, the applied filters do not appear to have any effect; perhaps this should be a separate issue.
You're correct - the consensus mutations are calculated across the entire dataset and are not computed based on the user's selections.
I can add a checkbox to the "download" -> "consensus mutation" dialog that specifies whether to use the whole dataset or just the sequences from the user's selection.
We're currently working on a refactor of some of the core components of the site – so these changes can't be implemented immediately... maybe a week or two? I'll let you know when it's live.
Albert
Hi Rose,
Apologies for the late reply to this.
1) I've added a download for this data. It's named "Group Counts" and is under the download button.
2) I added another download endpoint for consensus mutations that also filters on date ranges, locations, etc. Right now it's not linked up to any part of the site (still figuring that out), but it's available as an API endpoint. I've described it here: https://github.com/vector-engineering/covidcg/blob/master/API.md#dynamic-group-mutation-frequencies
Hello,
This site is my go-to resource for identifying lineages based on mutations in wastewater, and I've been sharing it with colleagues who really appreciate it, too. I am now looking specifically for mutations common in California within a date range (for example, to check my primers against the prevalent sequences or to know what SNVs I might expect to find in specific wastewater samples).
There are two tables I'm interested in being able to download:
Thanks in advance,
Rose