tgen / pegasusPipe

MIT License

Freebayes passes even when it crashes #40

Open ryanrichholt opened 6 years ago


1) Bugs in Freebayes sometimes cause it to crash randomly mid-chromosome.
2) Perf stat doesn't report the correct exit code, so the pipeline thinks it passed without errors.

I can try fixing it by updating Freebayes, but every recipe uses the same Freebayes binary, so an update would change the behavior of "frozen" recipes too.

More details on the two issues:

Issue 1:

Freebayes occasionally throws an error and aborts with exit code 134 (128 + 6, meaning the process was killed by SIGABRT). Here is the error message:

currentSequence matching: terminate called after throwing an instance of 'std::out_of_range' what(): basic_string::substr
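For reference, here is a minimal sketch of how a 134 can be decoded. Shells conventionally report "killed by signal N" as 128 + N, and signal 6 is SIGABRT, which is what an uncaught C++ exception raises through `terminate`. `describe_exit` is a hypothetical helper written for this illustration, not something in the pipeline.

```bash
#!/bin/sh
# Sketch: decode a shell exit status.
# describe_exit is a hypothetical helper, not part of pegasusPipe.
describe_exit() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        # By convention, 128 + N means "killed by signal N".
        local sig=$((status - 128))
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

describe_exit 134
```

A status above 128 can also come from a program that simply calls `exit` with a large value, so this decoding is a convention rather than a guarantee.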

There is a long discussion about this issue on the Freebayes GitHub repo. It looks like people are still reporting it as a bug. But it's possible that upgrading to a newer version will fix it. https://github.com/ekg/freebayes/issues/6

Issue 2:

Perf stat should always report the exit code of the command it's wrapping, but in this case it does not:

[rrichholt@dback-login1]/scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/oeFiles$ perf stat /home/tgenref/pecan/bin/freebayes/bin/freebayes -f /home/tgenref/pecan/bwa_index/hs37d5_plusRibo_plusOncoViruses_plusERCC.fa -b /scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/KHS5U/FDAV_COLO829m100x0_1_CL_Whole_T1_S5U_L03520/FDAV_COLO829m100x0_1_CL_Whole_T1_S5U_L03520.proj.md.jr.bam -b /scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/KHS5U/FDAV_COLO829m0x100_1_CL_Whole_C1_S5U_L03522/FDAV_COLO829m0x100_1_CL_Whole_C1_S5U_L03522.proj.md.jr.bam -t /home/tgenref/pipeline_v0.4/chrListBED/dir2/Step16.bed --ploidy 2 --min-repeat-entropy 1 > /scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/freebayes/FDAV_COLO829m0x100_1_CL_Whole_C1_S5U_L03522-FDAV_COLO829m100x0_1_CL_Whole_T1_S5U_L03520/FDAV_COLO829m0x100_1_CL_Whole_C1_S5U_L03522-FDAV_COLO829m100x0_1_CL_Whole_T1_S5U_L03520_Step16.freebayes.vcf
Exception: Unable to read reference sequence base past end of current cached sequence.
16:7058722
7058932-7059044
alignment: CCCTCCCTTCCTCCCTCCCTTCTTTCCTCCCTCCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTTCCTCCCTTCTTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCT
currentSequence: GGCGCATCTGTAATTCCAGACACTCAGAAGTCAGAGGCAGCAGAATCGCTTGAACCTAGGAAGCAAAGGTTGCAGTGATCTGAGATCATACCACTGCACTCCAGCCTGGGTGACAGGGCAAGACTTTGTCTCAAAAAAAAAGTAGAATTCAGTGAACAAGAGTAAATAATTCAAGAAATTCAAGGTTGATAGCTATAAAAATGAAATTCCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTTCCTCCTCCCTTCCTCCCTCCCTCCCTTCCTCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCTCCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCTCACTTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTC
currentSequence matching:
Exception: Unable to read reference sequence base past end of current cached sequence.
16:7058722
7058978-7059071
alignment: CTTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTTCCTTCCTTCCTCCCCCCCTTCCTCCCTCCCTTCTTCCCTCCGTCCCTTCCTCCC
currentSequence: GGCGCATCTGTAATTCCAGACACTCAGAAGTCAGAGGCAGCAGAATCGCTTGAACCTAGGAAGCAAAGGTTGCAGTGATCTGAGATCATACCACTGCACTCCAGCCTGGGTGACAGGGCAAGACTTTGTCTCAAAAAAAAAGTAGAATTCAGTGAACAAGAGTAAATAATTCAAGAAATTCAAGGTTGATAGCTATAAAAATGAAATTCCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTTCCTCCTCCCTTCCTCCCTCCCTCCCTTCCTCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCTCCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCTCACTTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTCCCTCCCTCCCTTCCTCCCTCCCTTCCTCCCTCCCTCCCTTCCTCCCTCCCTC
currentSequence matching:
terminate called after throwing an instance of 'std::out_of_range'
  what(): basic_string::substr
/home/tgenref/pecan/bin/freebayes/bin/freebayes: Aborted

Performance counter stats for '/home/tgenref/pecan/bin/freebayes/bin/freebayes -f /home/tgenref/pecan/bwa_index/hs37d5_plusRibo_plusOncoViruses_plusERCC.fa -b /scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/KHS5U/FDAV_COLO829m100x0_1_CL_Whole_T1_S5U_L03520/FDAV_COLO829m100x0_1_CL_Whole_T1_S5U_L03520.proj.md.jr.bam -b /scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/KHS5U/FDAV_COLO829m0x100_1_CL_Whole_C1_S5U_L03522/FDAV_COLO829m0x100_1_CL_Whole_C1_S5U_L03522.proj.md.jr.bam -t /home/tgenref/pipeline_v0.4/chrListBED/dir2/Step16.bed --ploidy 2 --min-repeat-entropy 1':

  51462.535837      task-clock (msec)         #    0.829 CPUs utilized
         2,339      context-switches          #    0.045 K/sec
           117      cpu-migrations            #    0.002 K/sec
        12,021      page-faults               #    0.234 K/sec
97,338,745,802      cycles                    #    1.891 GHz
103,031,568,295     instructions              #    1.06 insn per cycle
24,818,887,921      branches                  #  482.271 M/sec
   461,906,920      branch-misses             #    1.86% of all branches

  62.081542050 seconds time elapsed

[rrichholt@dback-login1]/scratch/rrichholt/jetstream_projects/smallCOLO829_Somatic_ps201709131621/oeFiles$ echo $?
0

Even though Freebayes crashed after only about 1/10th of the chromosome was complete, perf stat reported an exit code of 0. I noticed this while porting the pipeline to Slurm, where I removed the perf stat wrappers from the commands because Slurm accounting can handle that job now.
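For recipes that still need the perf stat wrapper, one possible workaround is to record the real exit status inside the wrapped command, so it survives even if the wrapper returns 0. This is only a sketch under assumptions: `run_wrapped` is a hypothetical helper, not existing pipeline code, and it has not been tried against the perf version on dback.

```bash
#!/bin/sh
# Sketch: recover a command's exit status when the wrapper around it
# (e.g. "perf stat") may swallow it. run_wrapped is a hypothetical helper.
#
# Usage: run_wrapped "<wrapper command>" <command> [args...]
run_wrapped() {
    local wrapper=$1
    shift
    local status_file status
    status_file=$(mktemp) || return 1
    # The inner sh runs the real command and writes its exit status to the
    # temp file, which is passed as $0 so that "$@" is exactly the command.
    # $wrapper is left unquoted on purpose so "perf stat" splits into words.
    $wrapper sh -c '"$@"; echo "$?" > "$0"' "$status_file" "$@"
    status=$(cat "$status_file")
    rm -f "$status_file"
    # An empty file means the inner shell never recorded a status; fail.
    return "${status:-1}"
}

# Example call (placeholder file names, not the real pipeline paths):
#   run_wrapped "perf stat" freebayes -f ref.fa -b tumor.bam > out.vcf
```

The command's stdout still flows through to the caller, so the existing `> ...vcf` redirect keeps working. The wrapper would be launching `sh` rather than Freebayes directly; as far as I know perf stat counts child processes by default, but that is worth checking before relying on the numbers.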