tjbencomo / ngs-pipeline

Pipeline for Somatic Variant Calling with WES and WGS data
MIT License
20 stars 4 forks

Re-examine WES resource allocations #74

Closed tjbencomo closed 3 years ago

tjbencomo commented 3 years ago

Will the current time allocations for the read-group-dependent rules (combine_fq, bwa, merge_bams) be OK if a sample has only 1 really big read group?

My initial time estimates came from samples with at most 60M reads. From there I computed a time scale factor of 200M / 60M (about 3.3, assuming we don't have samples with more than 200M reads) and multiplied the run times by that factor. Some samples have only 1 read group, though, so their work won't be distributed across multiple jobs.

These rules should probably get an additional size factor (say 3, to account for the 3 read groups in the test run).
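
As a rough sketch of the arithmetic above (not what the pipeline currently does): the depth factor is 200M / 60M, and the extra read group factor is 3 when a sample has a single read group. The function name `scaled_minutes` and the assumption that reads are split evenly across read groups are mine, for illustration only.

```python
# Hypothetical helper, not part of ngs-pipeline: scale a measured run time
# for the read-group-dependent rules (combine_fq, bwa, merge_bams).

BASELINE_READS = 60_000_000   # largest sample in the initial timing runs
MAX_READS = 200_000_000       # assumed upper bound on reads per sample
TEST_READ_GROUPS = 3          # read groups per sample in the test run


def scaled_minutes(base_minutes: int, n_read_groups: int = 1) -> int:
    """Return a time allocation in minutes, rounded up.

    Assumes (my assumption) that reads are split evenly across read groups,
    so a sample with fewer read groups than the test run puts
    proportionally more work into each per-read-group job.
    """
    if base_minutes < 0 or n_read_groups < 1:
        raise ValueError("base_minutes must be >= 0 and n_read_groups >= 1")
    # Never scale below the measured baseline: more read groups than the
    # test run had does not shrink the allocation.
    effective_groups = min(n_read_groups, TEST_READ_GROUPS)
    numerator = base_minutes * MAX_READS * TEST_READ_GROUPS
    denominator = BASELINE_READS * effective_groups
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-numerator // denominator)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "read group(s):", scaled_minutes(60, n), "minutes")
```

With a 60 minute baseline this gives 600 minutes for a single read group and 200 minutes for three, i.e. the depth factor alone once the work is split the way it was in the test run. In a Snakemake rule, a value like this could be fed into the rule's time resource through a function of `wildcards`, but how read group counts are looked up per sample depends on the pipeline's sample sheet.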