Enh: Add pangolin & nextclade & metadata update support for the nanopore workflow

combat-sars-cov-2 / workbench-deploy

IRIDA and Galaxy deployment manifest definitions and scripts

Apache License 2.0

Enh: Add pangolin & nextclade & metadata update support for the nanopore workflow #8

Open pvanheus opened 3 years ago

pvanheus commented 3 years ago

Is it possible to use sub-workflows for IRIDA? Downstream of consensus generation, we run the same analysis (pangolin & nextclade) for both Illumina and Nanopore.

pvanheus commented 3 years ago

As of today, the latest pangolin is 3.1.11 and the latest nextclade is 1.2.3.

pvanheus commented 3 years ago

Version updates:

guppyplex is now available via iuc in the main toolshed.
artic minion has been updated to version 1.2.1 and now requires a 'medaka model' parameter. The medaka model files are here: https://github.com/nanoporetech/medaka/tree/v1.2.3/medaka/data. r941_min_high_g360 is a sensible default for that parameter.

pvanheus commented 3 years ago

Metadata to be provided (with examples):

Nextclade clade - 20E (EU1)
Pango lineage - B.1.177
List of variants (from Nextclade, or summarised from ivar variants?) - e.g. C241T,T445C,C3037T,C6286T,C11396T,C14408T,C21614T,G22205C,C22227T,A23403G,C26801G,C27944T,C28932T,G29645T
List of variant amino acids (from Nextclade) - e.g. N:A220V,ORF1a:L3711F,ORF1b:P314L,S:L18F,S:D215H,S:A222V,S:D614G
From the QC report: pct_N_bases, pct_covered_bases, longest_no_N_run, num_aligned_reads
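
To make the shape of these fields concrete, here is a rough Python sketch of how the example values above could be collected and sanity-checked before being pushed as sample metadata. The dictionary keys, the helper names and the regular expressions are assumptions for illustration only; nothing here is an agreed field naming or existing code in this repository. It only handles simple substitutions of the form shown in the examples. No QC values are given above, so only the QC field names are listed.

```python
import re

# Hypothetical patterns for the two example formats quoted above:
#   nucleotide substitution: C241T      -> (ref, position, alt)
#   amino acid substitution: S:D614G    -> (gene, ref, position, alt)
NUC_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")
AA_RE = re.compile(r"^([^:]+):([A-Z*])(\d+)([A-Z*])$")


def parse_nuc_substitutions(text):
    """Split a comma-separated list like 'C241T,T445C' into tuples."""
    parsed = []
    for item in text.split(","):
        match = NUC_RE.match(item.strip())
        if match is None:
            raise ValueError(f"unrecognised nucleotide substitution: {item!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def parse_aa_substitutions(text):
    """Split a comma-separated list like 'S:D614G' into tuples."""
    parsed = []
    for item in text.split(","):
        match = AA_RE.match(item.strip())
        if match is None:
            raise ValueError(f"unrecognised amino acid substitution: {item!r}")
        gene, ref, pos, alt = match.groups()
        parsed.append((gene, ref, int(pos), alt))
    return parsed


# Example record using the values from this comment; the keys are placeholders.
metadata = {
    "nextclade_clade": "20E (EU1)",
    "pango_lineage": "B.1.177",
    "variants": (
        "C241T,T445C,C3037T,C6286T,C11396T,C14408T,C21614T,"
        "G22205C,C22227T,A23403G,C26801G,C27944T,C28932T,G29645T"
    ),
    "aa_variants": (
        "N:A220V,ORF1a:L3711F,ORF1b:P314L,S:L18F,S:D215H,S:A222V,S:D614G"
    ),
}

# QC fields named above; values would come from the QC report.
QC_FIELDS = ("pct_N_bases", "pct_covered_bases", "longest_no_N_run", "num_aligned_reads")

nuc = parse_nuc_substitutions(metadata["variants"])
aa = parse_aa_substitutions(metadata["aa_variants"])
```

Parsing the lists rather than passing them through as opaque strings is only one option; it makes it easier to catch a malformed entry before it ends up in the metadata.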

pvanheus commented 3 years ago

Two more metadata fields to add:

"consensus sequence software name" - this is the name of the pipeline
"consensus sequence software version" - this is the version of the pipeline
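
As a small sketch of how these two fields might be attached to a sample's metadata, assuming the metadata is held as a plain Python dict: the helper name and the example pipeline name and version below are made up for illustration, and only the two field names come from this comment.

```python
def add_pipeline_fields(metadata, pipeline_name, pipeline_version):
    """Return a copy of metadata with the two software fields added."""
    updated = dict(metadata)  # leave the caller's dict untouched
    updated["consensus sequence software name"] = pipeline_name
    # Store the version as text so values like "1.10" are not mangled.
    updated["consensus sequence software version"] = str(pipeline_version)
    return updated


# Placeholder values; the real ones would identify the deployed workflow.
sample_metadata = {"pango_lineage": "B.1.177"}
with_software = add_pipeline_fields(sample_metadata, "example-nanopore-pipeline", "0.1.0")
```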