nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

joint genotyping optimal strategy #923

Open TonyKess opened 1 year ago

TonyKess commented 1 year ago

I'm working with a large-ish dataset (30 TB of raw reads) and was wondering what a good strategy would be for using the joint_germline option to get cohort SNP calls. Would it make more sense to genotype all individuals together, or to parallelize the joint genotyping runs across smaller groups of samples and then filter and combine the joint_germline outputs with bcftools?

amizeranschi commented 1 year ago

@TonyKess I'm in a similar boat, but dealing with a slightly smaller dataset (5 TB). Have you made any progress with this scenario?