Open jessres opened 1 week ago
Hi @jessres, thanks for your request. I'm seeing three potential solutions to this issue:
qc_report.txt, e.g. GC After Trimming, Reference Length Coverage After Trimming, and Average Q Score After Trimming. Then, the main MycoSNP workflow could be run with only those samples with acceptable coverage. However, this approach wouldn't allow you to use any post-alignment QC metrics such as Mean Coverage Depth and Genome Fraction at 10X (new metric coming in v1.6), and this would still require manual assessment of the QC metrics to determine which samples to exclude from the main MycoSNP workflow.
--skip_combined_analysis parameter, so the MycoSNP workflow will only run through the alignment and QC report steps. Then, the full pipeline could be rerun without the failing samples. This also isn't ideal because the trimming/alignment/QC steps would be performed again for all the passing samples, but I thought I'd mention it as a potential option to save some time with the current version, in case you weren't already aware.
Hello @zmudge3, we appreciate such a quick response! Ideally option 3 is exactly what we are looking for. When do you expect v1.6 to be released?
Got it, thank you. We're hoping to have it released by the end of the year.
Is your feature request related to a problem? Please describe.
We ran into an issue where too many low-coverage samples resulted in a nearly empty vcf-to-fasta file, which affected the results for the passing samples and left us unable to generate a phylogenetic tree. Further investigation shows that even 2-3 very low-coverage samples can affect the accuracy of the phylogenetic tree.
Describe the solution you'd like
We would like failed low-coverage samples to be removed before vcf-to-fasta generation so that only passing samples are used for the core genome and the phylogenetic tree results are accurate. QC results should still include all samples.
Describe alternatives you've considered
We have considered re-analyzing the run with just the passing samples; however, this negatively impacts our automated workflow and turnaround time (TAT).
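Until filtering is built into the pipeline, choosing the passing samples for a re-run could be scripted rather than done by hand. The sketch below is only an illustration under stated assumptions: it treats qc_report.txt as tab-delimited with a header row, guesses a "Sample Name" column for the sample ID, and uses a placeholder threshold of 20 on the Reference Length Coverage After Trimming column mentioned in this thread. None of these are confirmed MycoSNP defaults, and `split_by_coverage` is a hypothetical helper, not part of the workflow.

```python
import csv
import io

# Assumed column names: the coverage column is quoted from this thread;
# "Sample Name" is a guess at the sample ID header in qc_report.txt.
SAMPLE_COL = "Sample Name"
COVERAGE_COL = "Reference Length Coverage After Trimming"


def split_by_coverage(report_text, min_coverage=20.0):
    """Return (passing, failing) sample names from a tab-delimited QC report.

    min_coverage is a placeholder threshold, not a MycoSNP default.
    """
    passing, failing = [], []
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    for row in reader:
        name = row.get(SAMPLE_COL, "")
        try:
            coverage = float(row[COVERAGE_COL])
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric coverage is treated as a failure.
            failing.append(name)
            continue
        if coverage >= min_coverage:
            passing.append(name)
        else:
            failing.append(name)
    return passing, failing


if __name__ == "__main__":
    # Tiny made-up report, for illustration only.
    example = (
        "Sample Name\tReference Length Coverage After Trimming\n"
        "S1\t85.2\n"
        "S2\t3.1\n"
        "S3\t42.0\n"
    )
    keep, drop = split_by_coverage(example)
    print("keep:", keep)
    print("drop:", drop)
```

The passing list could then be used to build the input for the re-run, while the original QC report still covers every sample.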