Will the current walltimes for the read-group-dependent rules (combine_fq, bwa, merge_bams) hold up if a sample has only one very large read group?
My initial time estimates came from samples with at most 60M reads. From there I took 200M / 60M (assuming we don't have samples with more than 200M reads) as the time scale factor and multiplied the measured runtimes by it. Some samples have only one read group, though, so their work won't be distributed across multiple jobs.
These rules should probably get an additional size factor based on read-group count (say 3, matching the 3 read groups in the test run).
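To make the scaling concrete, here is a minimal sketch of the estimate described above, with the extra read-group factor folded in. The function name, baseline numbers, and default group count are illustrative assumptions, not values from the actual pipeline config:

```python
# Hypothetical walltime estimator for read-group-dependent rules
# (combine_fq, bwa, merge_bams). Baselines below are assumptions
# taken from the discussion, not the real pipeline config.

BASELINE_READS = 60_000_000   # max reads among the samples used for timing
MAX_READS = 200_000_000       # assumed upper bound on reads per sample

def estimate_minutes(base_minutes, n_read_groups, baseline_groups=3):
    """Scale a measured runtime to the assumed worst case.

    base_minutes: runtime observed at BASELINE_READS, spread over
    baseline_groups read groups. If a sample has fewer read groups,
    each job does proportionally more work, so on top of the
    read-count factor (200M / 60M) we multiply by
    baseline_groups / n_read_groups.
    """
    read_factor = MAX_READS / BASELINE_READS          # ~3.33x
    group_factor = baseline_groups / max(n_read_groups, 1)
    return base_minutes * read_factor * group_factor

# A rule timed at 60 min on 3 read groups needs roughly 3x more
# time per job when all reads land in a single read group
# (~600 min vs ~200 min).
```

Under these assumptions, a single-read-group sample triples the per-job estimate relative to the three-read-group test run, which is exactly the gap the current times would miss.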