Gleeson-Lab / wxs_pipeline

Starting with BAMs and FASTQs, follow GATK 4.0 Best Practices up to generating a joint-genotyped VCF

Adjust Calling Intervals Based on Size of Input Files #9

Open brcopeland opened 2 years ago

brcopeland commented 2 years ago

Larger BAMs/FASTQs should get smaller calling intervals, and smaller inputs should get larger ones. We might also want to adjust based on the number of samples.
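
A rough sketch of what such a heuristic could look like, in Python. Everything here is hypothetical and not taken from the pipeline: the function name `pick_interval_length`, the reference values (10 GiB mean input size, 10 samples, 50 Mb base interval), and the clamping bounds are all made-up placeholders. It also assumes that more samples should mean smaller intervals, which is only one possible reading of the suggestion above.

```python
def pick_interval_length(file_sizes_bytes, n_samples,
                         base_length=50_000_000,        # hypothetical default interval (bp)
                         reference_bytes=10 * 1024**3,  # hypothetical "typical" input size
                         reference_samples=10,          # hypothetical "typical" cohort size
                         min_length=1_000_000,
                         max_length=250_000_000):
    """Scale the interval length inversely with mean input size and sample count."""
    if not file_sizes_bytes or n_samples < 1:
        raise ValueError("need at least one input file and one sample")

    mean_size = sum(file_sizes_bytes) / len(file_sizes_bytes)

    # Larger inputs -> factor below 1 -> smaller intervals, and vice versa.
    size_factor = reference_bytes / max(mean_size, 1)
    # Assumption: more samples -> smaller intervals.
    sample_factor = reference_samples / n_samples

    length = int(base_length * size_factor * sample_factor)
    # Clamp so tiny or huge inputs do not produce absurd interval lengths.
    return max(min_length, min(max_length, length))
```

In practice the file sizes could come from `os.path.getsize` on each BAM/FASTQ; the constants would need tuning against real runs.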

brcopeland commented 2 years ago

For now, amplicon sequencing just calls on the entire genome. Amplicon data was the original impetus for this change, so it may no longer be necessary.

brcopeland commented 2 years ago

I think the default interval lengths should be increased to cut down on the number of jobs created. We seem to hit throughput bottlenecks from the sheer number of jobs, so the parallelism isn't actually paying off.
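
To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. It assumes one job per interval per sample, which may not match how the pipeline actually scatters work, and uses roughly 3.1 Gb as an approximate human genome length. The function name `count_jobs` and the example interval lengths are illustrative only.

```python
def count_jobs(genome_length, interval_length, n_samples=1):
    """Estimate job count, assuming one job per interval per sample."""
    if interval_length < 1 or n_samples < 1:
        raise ValueError("interval_length and n_samples must be positive")
    # Integer ceiling division: the last interval may be shorter than the rest.
    n_intervals = -(-genome_length // interval_length)
    return n_intervals * n_samples


GENOME_LENGTH = 3_100_000_000  # approximate human genome size in bp (assumption)

# Compare a shorter and a longer interval length for a 100-sample cohort.
for interval in (10_000_000, 50_000_000):
    print(interval, count_jobs(GENOME_LENGTH, interval, n_samples=100))
```

Under these assumptions, going from 10 Mb to 50 Mb intervals cuts the job count by a factor of five, at the cost of each job running longer.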