I'm working with a large-ish dataset (30 TB of raw reads) and am wondering what a good strategy would be for using the joint_germline option to get cohort SNP calls. Would it make more sense to genotype all individuals together, or to parallelize the joint genotyping runs across smaller groups of samples and then filter and combine the joint_germline outputs with bcftools?