CDCgov / SARS-CoV-2_Sequencing

A collection of sequencing protocols and bioinformatic resources for SARS-CoV-2 sequencing.
Apache License 2.0
344 stars 83 forks

Great initiative! #3

Closed Kirk3gaard closed 4 years ago

Kirk3gaard commented 4 years ago

Awesome to see some efforts for standardisation.

Looking forward to following the input on QC measures (quality management).

I think it would be relevant to tag genomes with the protocol that was used to generate them, e.g. ARTIC Network wet-lab protocol + bioinformatics pipeline, as some genomes might have protocol-specific issues that need to be taken into account when comparing sequences.
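
A rough sketch of what such a tag could look like, and how it might be used to group genomes before comparing them. All field names, sample IDs and values here are hypothetical placeholders for illustration, not an existing GISAID or NCBI metadata schema:

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ProtocolTag:
    """Hypothetical provenance tag attached to one consensus genome."""
    wetlab_protocol: str   # free-text name of the wet-lab protocol
    primer_scheme: str     # placeholder label, not a real scheme identifier
    platform: str          # sequencing platform
    pipeline: str          # bioinformatics pipeline name (placeholder)
    pipeline_version: str  # pipeline version string (placeholder)


def group_by_protocol(genomes):
    """Group genome IDs by (wet-lab protocol, pipeline).

    Genomes in the same group share the same protocol-specific caveats.
    """
    groups = {}
    for genome_id, tag in genomes.items():
        key = (tag.wetlab_protocol, tag.pipeline)
        groups.setdefault(key, []).append(genome_id)
    return groups


# Made-up example records, for illustration only.
genomes = {
    "sample-001": ProtocolTag("ARTIC Network amplicon", "scheme-A", "nanopore", "pipeline-X", "0.1"),
    "sample-002": ProtocolTag("ARTIC Network amplicon", "scheme-A", "nanopore", "pipeline-X", "0.1"),
    "sample-003": ProtocolTag("shotgun metagenomic", "none", "illumina", "pipeline-Y", "0.2"),
}

groups = group_by_protocol(genomes)

# The tag serialises to plain JSON, so it could travel alongside a sequence record.
tag_json = json.dumps(asdict(genomes["sample-001"]), sort_keys=True)
```

Keeping the tag as a small flat record would make it easy to store next to the existing metadata fields, but which fields are actually worth recording is exactly the kind of thing that needs community input.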

A raw data catalogue for the different protocols would also be super helpful for people setting up and testing bioinformatics pipelines.

dmaccannell commented 4 years ago

These are all great ideas.

We deliberately left the QC section minimal in the initial version, because we wanted to get feedback from the community on what sort of thresholds, quality measures and process controls are actually being used, and which might be the most feasible. We'd definitely welcome any edits to that section, and we should be adding more shortly.

We also plan to expand considerably on bioinformatic tools and workflows, and (again) would welcome contributions.

Consensus sequences in GISAID and NCBI usually have a few protocol notes attached in the metadata, but it might be worthwhile to recommend a more detailed summary of the end-to-end process, covering both the wet-lab and bioinformatics steps.

We plan to add an index and reference links, and to gather together some example data from different sequencing platforms and protocols for groups that are trying to validate their bioinformatic workflows.