Open pvanheus opened 3 years ago
Hi Peter,
Here are some grammar mistakes and some things that may become potential issues for those working through the tutorial.
nucleocapid (N) should be nucleocapsid
Sequencing of viral samples can be used to (missing word)
Lots of grammar mistakes in the sentence starting with "Detection of mutations among the circulating SARS-CoV-2 strains...", so I'll just write it out completely for a copy and paste below
Detecting mutations among the circulating SARS-CoV-2 strains is important to infer the lineages within which these strains fall and is therefore important in tracing the emergence of new strains, both at a national and local level. Here are some important definitions:
The definition for epidemiology makes one none the wiser. Can you try:
Epidemiology: the study of the patterns, the spread and the causes of diseases and disorders within a given population and the application of this knowledge to prevent and control health problems
Is the use of pathogen genomic data to determine the distribution and spread of infectious diseases in a specified population and the application
the virus genome can be sequenced and this data can be used to monitor the emergence of important new variants, and to monitor the trends after an intervention
short read sequence data with the aim of identifying the
(the datasets import) without the "zenodo" part
Check that the datatype....?(the sentence is not complete)
The tool seems to be called QualiMap BamQC and not just BamQC. I struggled to find it in a fussy instance like usegalaxy.eu.
In the BamQC output, examine the report for *???????, pay special attention to the Mean Coverage (in section Coverage) and the Coverage across reference, Coverage Histogram and Coverage Histogram (0-50X) plots. (Does the part about which report should be examined need to be there? The sentence seems to be missing the report name, which was already mentioned in the previous line anyway. Also, maybe there should be a full stop instead of a comma.)
Open the [ivar variants]{toolshed.g2.bx.psu.edu/repos/iuc/ivar_variants/ivar_variants/1.3.1+galaxy2} tool.
"After setting these parameters Execute the tool." should be under point 5 and not under point 3. I nearly clicked Execute, only to be given more instructions that were required before executing.
Rename the reference sequence with Text transformation with sed (the instructions do not make clear that the VCF should be selected as the input when running sed)
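To make the point about the VCF concrete, here is a rough Python sketch of what I understand the sed step to be doing: it rewrites the reference name in the VCF's contig header and CHROM column, which is why the VCF has to be the selected input. The function name and the two accession names in the example are placeholders I picked for illustration, not values taken from the tutorial.

```python
def rename_vcf_reference(vcf_text: str, old_name: str, new_name: str) -> str:
    """Return VCF text with the reference (CHROM) name replaced.

    Hypothetical helper for illustration; it mimics a sed-style rename
    but only touches the contig header and the first column.
    """
    contig_prefix = "##contig=<ID="
    out = []
    for line in vcf_text.splitlines():
        if line.startswith(contig_prefix):
            rest = line[len(contig_prefix):]
            # Only rename an exact ID match (followed by ',' or '>').
            if rest.startswith(old_name) and rest[len(old_name):len(old_name) + 1] in (",", ">"):
                line = contig_prefix + new_name + rest[len(old_name):]
        elif not line.startswith("#"):
            # Data line: the reference name is the first tab-separated field.
            fields = line.split("\t")
            if fields[0] == old_name:
                fields[0] = new_name
                line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative input only; the names below are assumptions.
example = (
    "##fileformat=VCFv4.2\n"
    "##contig=<ID=MN908947.3,length=29903>\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "MN908947.3\t23403\t.\tA\tG\t.\tPASS\t.\n"
)
renamed = rename_vcf_reference(example, "MN908947.3", "NC_045512.2")
print(renamed)
```

If the sed tool is pointed at a different dataset in the history (the BAM or the reference FASTA, say), nothing like this happens to the variants, so naming the VCF explicitly in the step would help.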
".... changes amino acid 614 in this protein from a Aspartate (Asp or D)" (we normally don't say Aspartate, but Aspartic acid - assuming that it's not automatically ionized)
There are no solutions
The existing [SARS-CoV-2 variant analysis] tutorial is focused on the analysis of metagenomic sequencing data. The majority of data being produced, however, uses the ARTIC amplicon protocol (for Illumina and Nanopore). The RECoVERY, SANBI and the Galaxy COVID-19 project have produced workflows for SARS-CoV-2 amplicon analysis.
A new tutorial is needed that walks through at least one of these workflows. It can also mention the use of Nextclade and Pangolin.
I (@pvanheus) propose to develop such a tutorial and refine it during the ASBCB Omics Codeathon: https://datascience.nih.gov/news/participant-applications-asbcb-omics-codeathon in June.
To fill out that form, enter your details, mention the Galaxy SARS-CoV-2 Amplicon tutorial and then select:
(this last one because our project is not on the official list (yet))