replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

Publish VCF files #266

Open hoelzer opened 3 months ago

hoelzer commented 3 months ago

For many downstream analyses it would be nice to also publish the VCF files with the ref/alternate alleles and their frequencies.

hoelzer commented 3 months ago

Ah wait, it seems we already publish them to the 3.Lineages_Clades_Mutations folder.

E.g.

Are the final variant calls split into pass and fail? (If so, what are the default filter values?)

MarieLataretu commented 1 month ago

Yes, the called variants get split based on two thresholds, min_depth (checked against DP) and a quality threshold (checked against QUAL and GQ): https://github.com/artic-network/fieldbioinformatics/blob/1.3.0-dev/artic/vcf_filter.py#L54-L79

The default values in poreCov are 20 for the quality threshold and ${params.min_depth} = 20 for min_depth, the same as in ARTIC (https://github.com/artic-network/fieldbioinformatics/blob/1.3.0-dev/artic/pipeline.py#L115-L118).

Note that these two filter options are not available in ARTIC's master branch, only in 1.3.0-dev.
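To make the pass/fail split concrete, here is a minimal sketch of that kind of threshold filter in plain Python. It is not ARTIC's vcf_filter.py; the function names and the min_qual parameter name are hypothetical, it only looks at DP in the INFO column and at QUAL, and it ignores GQ. The defaults of 20 are the values mentioned above.

```python
from typing import Iterable, List, Optional, Tuple


def passes_filter(depth: int, qual: Optional[float],
                  min_depth: int = 20, min_qual: float = 20.0) -> bool:
    """Return True if a variant meets both thresholds (hypothetical helper)."""
    if depth < min_depth:
        return False
    # A missing QUAL value ('.') is treated as failing in this sketch.
    if qual is None or qual < min_qual:
        return False
    return True


def split_vcf_lines(lines: Iterable[str], min_depth: int = 20,
                    min_qual: float = 20.0) -> Tuple[List[str], List[str]]:
    """Split VCF record lines into (passed, failed); header lines are skipped."""
    passed: List[str] = []
    failed: List[str] = []
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # INFO is column 8; keep only key=value entries.
        info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
        depth = int(info.get("DP", 0))  # assumption: DP is stored in INFO
        qual = None if fields[5] == "." else float(fields[5])
        target = passed if passes_filter(depth, qual, min_depth, min_qual) else failed
        target.append(line)
    return passed, failed


records = [
    "##fileformat=VCFv4.2",
    "MN908947.3\t241\t.\tC\tT\t45.0\tPASS\tDP=120",
    "MN908947.3\t3037\t.\tC\tT\t12.5\tPASS\tDP=80",   # low QUAL
    "MN908947.3\t14408\t.\tC\tT\t50.0\tPASS\tDP=8",   # low depth
]
passed, failed = split_vcf_lines(records)
print(len(passed), "passed;", len(failed), "failed")
```

With the example records, only the first variant clears both thresholds; the other two end up in the fail list.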

@jonas-fuchs pointed out that VCFs from ARTIC in Galaxy do have allele frequencies. I didn't have time to test it, but it would be medaka tools annotate with the additional --dpsp option and this script.
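For illustration only, here is a rough sketch of how an allele frequency could be derived from such an annotated VCF. I have not checked it against the script mentioned above. It assumes that the annotation adds an SR INFO field holding per-strand read counts in the order ref fwd, ref rev, alt1 fwd, alt1 rev, and so on; that layout, the function name, and the choice of denominator (all reads assigned to any allele) are assumptions.

```python
from typing import List


def allele_frequencies(sr_field: str) -> List[float]:
    """Compute one frequency per alternate allele from an SR-style string.

    Assumed layout (not verified here): comma-separated counts ordered as
    ref fwd, ref rev, alt1 fwd, alt1 rev, alt2 fwd, alt2 rev, ...
    """
    counts = [int(x) for x in sr_field.split(",")]
    if len(counts) < 4 or len(counts) % 2 != 0:
        raise ValueError(f"unexpected SR layout: {sr_field!r}")
    # Sum forward and reverse strand counts for each allele.
    per_allele = [counts[i] + counts[i + 1] for i in range(0, len(counts), 2)]
    total = sum(per_allele)
    if total == 0:
        return [0.0] * (len(per_allele) - 1)
    # Skip the reference allele (index 0) and report the alternates.
    return [alt / total for alt in per_allele[1:]]


# 20 reads support the reference, 80 support the alternate allele.
print(allele_frequencies("10,10,40,40"))
```

Using the summed SR counts as the denominator means reads that support no allele are ignored; dividing by a total spanning depth instead would give slightly lower frequencies.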