icbi-lab / nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

Bwa terminated with an error exit status (137) -- Execution is retried #50

Closed amoschoomy closed 9 months ago

amoschoomy commented 10 months ago

Hi again, I am running the pipeline again, but this time I am hitting the error stated in the title. The logs are below.

These are my Slurm logs:

[-        ] process > merge_fastq                    -
[2b/15d707] process > RegionsBedToIntervalList (R... [100%] 1 of 1, cached: 1 ✔
[0e/398463] process > BaitsBedToIntervalList (Bai... [100%] 1 of 1, cached: 1 ✔
[b9/31a70e] process > preprocessIntervalList (pre... [100%] 1 of 1, cached: 1 ✔
[9c/873265] process > SplitIntervals (SplitInterv... [100%] 1 of 1 ✔
[89/c01f38] process > IntervalListToBed (BedFromI... [100%] 1 of 1, cached: 1 ✔
[1b/11c840] process > ScatteredIntervalListToBed ... [100%] 40 of 40 ✔
[09/3b3516] process > FastQC (sample1 : normal_DNA)  [100%] 6 of 6, cached: 6 ✔
[eb/2c43dc] process > fastp (sample1 : tumor_DNA)    [100%] 6 of 6, cached: 6 ✔
[df/114487] process > FastQC_trimmed (sample1 : n... [100%] 6 of 6, cached: 6 ✔
[-        ] process > make_uBAM                      [  0%] 0 of 4
[43/c4e6c7] process > Bwa (sample1 : tumor_DNA)      [ 20%] 1 of 5, failed: 1...
[-        ] process > merge_uBAM_BAM                 -
[-        ] process > MarkDuplicates                 -
[-        ] process > alignmentMetrics               -
[-        ] process > scatterBaseRecalGATK4          -
[-        ] process > gatherGATK4scsatteredBQSRta... -
[-        ] process > scatterGATK4applyBQSRS         -
[-        ] process > GatherRecalBamFiles            -
[-        ] process > GetPileup                      -
[-        ] process > Mutect2                        -
[-        ] process > gatherMutect2VCFs              -
[-        ] process > FilterMutect2                  -
[-        ] process > HaploTypeCaller                -
[-        ] process > CNNScoreVariants               -
[-        ] process > MergeHaploTypeCallerGermlin... -
[-        ] process > FilterGermlineVariantTranches  -
[-        ] process > IndelRealignerIntervals        -
[-        ] process > GatherRealignedBamFiles        -
[-        ] process > VarscanSomaticScattered        -
[-        ] process > gatherVarscanVCFs              -
[-        ] process > ProcessVarscan                 -
[-        ] process > FilterVarscan                  -
[-        ] process > MergeAndRenameSamplesInVars... -
[-        ] process > MantaSomaticIndels             -
[-        ] process > StrelkaSomatic                 -
[-        ] process > finalizeStrelkaVCF             -
[-        ] process > mkHCsomaticVCF                 -
[-        ] process > VepTab                         -
[-        ] process > mkCombinedVCF                  -
[-        ] process > VEPvcf                         -
[-        ] process > ReadBackedphasing              -
[-        ] process > AlleleCounter                  -
[-        ] process > ConvertAlleleCounts            -
[-        ] process > Ascat                          -
[-        ] process > SequenzaUtils                  -
[-        ] process > gatherSequenzaInput            -
[-        ] process > Sequenza                       -
[7c/3377e8] process > make_CNVkit_access_file (mk... [100%] 1 of 1, cached: 1 ✔
[-        ] process > CNVkit                         -
[-        ] process > Clonality                      -
[-        ] process > MutationalBurden               -
[-        ] process > MutationalBurdenCoding         -
[-        ] process > mhc_extract                    -
[-        ] process > pre_map_hla                    -
[-        ] process > OptiType                       -
[06/30971f] process > pre_map_hla_RNA (sample2)      [100%] 2 of 2, cached: 2 ✔
[bf/65d76b] process > OptiType_RNA (sample1)         [100%] 2 of 2, cached: 2 ✔
[-        ] process > run_hla_hd                     -
[-        ] process > get_vhla                       -
[-        ] process > Neofuse                        -
[-        ] process > publish_NeoFuse                -
[-        ] process > add_geneID                     -
[-        ] process > gene_annotator                 -
[-        ] process > pVACseq                        -
[-        ] process > concat_pVACseq_files           -
[-        ] process > aggregated_reports             -
[-        ] process > pVACtools_generate_protein_seq -
[-        ] process > pepare_mixMHC2_seq             -
[-        ] process > mixMHC2pred                    -
[-        ] process > addCCF                         -
[-        ] process > make_epitopes_fasta            -
[-        ] process > blast_epitopes                 -
[-        ] process > add_blast_hits                 -
[-        ] process > csin                           -
[-        ] process > immunogenicity_scoring         -
[2e/46c154] process > mixcr (sample1 : tumor_RNA)    [ 33%] 2 of 6, cached: 2
[-        ] process > collectSampleInfo              -
[-        ] process > multiQC                        -
[29/6fb93c] NOTE: Process `Bwa (sample2 : tumor_DNA)` terminated with an error exit status (137) -- Execution is retried (1)

These are the Nextflow logs:

~> TaskHandler[id: 28; name: Bwa (sample1 : normal_DNA); status: RUNNING; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/15a0370e23d72ed33088db22c6a446]
~> TaskHandler[id: 20; name: Bwa (sample2 : tumor_DNA); status: RUNNING; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/29/6fb93cda2f5fe0f762967a3d788ae8]
Sep-25 15:39:33.952 [Task submitter] DEBUG n.processor.TaskPollingMonitor - %% executor local > tasks in the submission queue: 10 -- tasks to be submitted are shown below
~> TaskHandler[id: 26; name: make_uBAM (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/79/143b9f1f74e8db822efe77d3f7c653]
~> TaskHandler[id: 29; name: make_uBAM (sample1 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/a2/a089c415fad2d56cf6abc00f657ffe]
~> TaskHandler[id: 18; name: make_uBAM (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/d5/664efa51d24be72c37b38855b69b6a]
~> TaskHandler[id: 19; name: make_uBAM (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/c7/73aaf11c16275e3ef7678d1810ac0e]
~> TaskHandler[id: 17; name: Bwa (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/43/c4e6c79262aaab65bb5f5df8930360]
~> TaskHandler[id: 25; name: Bwa (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/46/9bea87330f9e6abb241f98f73bc426]
~> TaskHandler[id: 38; name: mixcr (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/52/ecf74a82aff60aa6a7321807b1d01c]
~> TaskHandler[id: 42; name: mixcr (sample1 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/fd7b1b2f102fcb967e9d6ef665ae4f]
~> TaskHandler[id: 41; name: mixcr (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/e3/ac5451a02a26825da75ea085a49583]
~> TaskHandler[id: 37; name: mixcr (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/0b/db949982d4b866e1a4ef77e75dc083]
Sep-25 15:44:21.547 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 20; name: Bwa (sample2 : tumor_DNA); status: COMPLETED; exit: 137; error: -; workDir: /QRISdata/Q6373/results/results_1/work/29/6fb93cda2f5fe0f762967a3d788ae8]
Sep-25 15:44:21.554 [Task monitor] INFO  nextflow.processor.TaskProcessor - [29/6fb93c] NOTE: Process `Bwa (sample2 : tumor_DNA)` terminated with an error exit status (137) -- Execution is retried (1)
Sep-25 15:44:21.560 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Sep-25 15:44:21.561 [Task submitter] INFO  nextflow.Session - [43/c4e6c7] Submitted process > Bwa (sample1 : tumor_DNA)
Sep-25 15:44:33.140 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 2 -- submitted tasks are shown below
~> TaskHandler[id: 28; name: Bwa (sample1 : normal_DNA); status: RUNNING; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/15a0370e23d72ed33088db22c6a446]
~> TaskHandler[id: 17; name: Bwa (sample1 : tumor_DNA); status: RUNNING; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/43/c4e6c79262aaab65bb5f5df8930360]
Sep-25 15:44:34.563 [Task submitter] DEBUG n.processor.TaskPollingMonitor - %% executor local > tasks in the submission queue: 10 -- tasks to be submitted are shown below
~> TaskHandler[id: 26; name: make_uBAM (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/79/143b9f1f74e8db822efe77d3f7c653]
~> TaskHandler[id: 29; name: make_uBAM (sample1 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/a2/a089c415fad2d56cf6abc00f657ffe]
~> TaskHandler[id: 18; name: make_uBAM (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/d5/664efa51d24be72c37b38855b69b6a]
~> TaskHandler[id: 19; name: make_uBAM (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/c7/73aaf11c16275e3ef7678d1810ac0e]
~> TaskHandler[id: 25; name: Bwa (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/46/9bea87330f9e6abb241f98f73bc426]
~> TaskHandler[id: 38; name: mixcr (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/52/ecf74a82aff60aa6a7321807b1d01c]
~> TaskHandler[id: 42; name: mixcr (sample1 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/fd7b1b2f102fcb967e9d6ef665ae4f]
~> TaskHandler[id: 41; name: mixcr (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/e3/ac5451a02a26825da75ea085a49583]
~> TaskHandler[id: 37; name: mixcr (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/0b/db949982d4b866e1a4ef77e75dc083]
~> TaskHandler[id: 83; name: Bwa (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/a6/cb11d2e03fcbad604abc04ac742277]
Sep-25 15:49:33.151 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 2 -- submitted tasks are shown below
~> TaskHandler[id: 28; name: Bwa (sample1 : normal_DNA); status: RUNNING; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/15a0370e23d72ed33088db22c6a446]
~> TaskHandler[id: 17; name: Bwa (sample1 : tumor_DNA); status: RUNNING; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/43/c4e6c79262aaab65bb5f5df8930360]
Sep-25 15:49:35.150 [Task submitter] DEBUG n.processor.TaskPollingMonitor - %% executor local > tasks in the submission queue: 10 -- tasks to be submitted are shown below
~> TaskHandler[id: 26; name: make_uBAM (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/79/143b9f1f74e8db822efe77d3f7c653]
~> TaskHandler[id: 29; name: make_uBAM (sample1 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/a2/a089c415fad2d56cf6abc00f657ffe]
~> TaskHandler[id: 18; name: make_uBAM (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/d5/664efa51d24be72c37b38855b69b6a]
~> TaskHandler[id: 19; name: make_uBAM (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/c7/73aaf11c16275e3ef7678d1810ac0e]
~> TaskHandler[id: 25; name: Bwa (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/46/9bea87330f9e6abb241f98f73bc426]
~> TaskHandler[id: 38; name: mixcr (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/52/ecf74a82aff60aa6a7321807b1d01c]
~> TaskHandler[id: 42; name: mixcr (sample1 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/fd7b1b2f102fcb967e9d6ef665ae4f]
~> TaskHandler[id: 41; name: mixcr (sample2 : normal_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/e3/ac5451a02a26825da75ea085a49583]
~> TaskHandler[id: 37; name: mixcr (sample1 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/0b/db949982d4b866e1a4ef77e75dc083]
~> TaskHandler[id: 83; name: Bwa (sample2 : tumor_DNA); status: NEW; exit: -; error: -; workDir: /QRISdata/Q6373/results/results_1/work/a6/cb11d2e03fcbad604abc04ac742277]
Sep-25 15:49:39.034 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 28; name: Bwa (sample1 : normal_DNA); status: COMPLETED; exit: 137; error: -; workDir: /QRISdata/Q6373/results/results_1/work/f3/15a0370e23d72ed33088db22c6a446]
Sep-25 15:49:39.035 [Task monitor] INFO  nextflow.processor.TaskProcessor - [f3/15a037] NOTE: Process `Bwa (sample1 : normal_DNA)` terminated with an error exit status (137) -- Execution is retried (1)
Sep-25 15:49:39.062 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Sep-25 15:49:39.062 [Task submitter] INFO  nextflow.Session - [46/9bea87] Submitted process > Bwa (sample2 : normal_DNA)

I went through these working directories, but there are no other helpful logs available. Many thanks!

riederd commented 10 months ago

Can you post the contents of /QRISdata/Q6373/results/results_1/work/f3/15a0370e23d72ed33088db22c6a446 as a tar.gz archive?

amoschoomy commented 10 months ago

work.tar.gz

Here are the files, packaged as a tar.gz archive.

riederd commented 10 months ago

Thanks!

From the .command.err log file you can see that the sambamba command was killed by your system:

38 Killed | sambamba sort --sort-picard --tmpdir=/scratch/temp/5845503 -m 64G -l 6 -t 8 -o sample1_normal_DNA_aligned.bam /dev/stdin

How much memory does your system have? If it has <=64GB, you might try lowering the memory requested for sambamba (and possibly also for other tools) in conf/params.config:

e.g.: SB_sort_mem = "32G"

around line 146
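
For reference, the exit status 137 that Nextflow reports fits the "Killed" message above: by shell convention, a status above 128 means the process was terminated by signal number (status - 128), and 137 - 128 = 9 is SIGKILL. A minimal Python sketch of that decoding follows; the helper name `describe_exit_status` is made up for illustration and is not part of nextNEOpi or Nextflow.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status (hypothetical helper, for illustration)."""
    if status > 128:
        # Shell convention: 128 + N means "terminated by signal N".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return "success" if status == 0 else f"exited with error code {status}"


print(describe_exit_status(137))
```

A SIGKILL on an HPC node often comes from the kernel OOM killer or from the scheduler enforcing the job's memory limit, but the exit status alone does not prove that, so the memory question above is still the thing to check.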

amoschoomy commented 10 months ago

I am running it on my HPC cluster and had allocated 64GB of memory. I will try to allocate more memory on the cluster or do what you suggested. Thanks!
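
As a rough sanity check of the numbers in this thread: sambamba sort was started with -m 64G inside a 64GB allocation, and the Nextflow log shows two Bwa tasks running at the same time on the local executor. The sketch below only does that arithmetic. The function names, the 25% headroom for bwa and other overhead, and the idea of multiplying by the number of concurrent tasks are assumptions made for illustration, not something the pipeline itself computes.

```python
import re

# Binary multipliers for the size suffixes used in the thread ("64G", "64GB").
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_mem(text: str) -> int:
    """Parse a size such as '64G', '64GB' or '32 GiB' into bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT])i?B?\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised memory size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2).upper()])


def fits(requested: str, allocated: str, concurrent_tasks: int = 1,
         headroom: float = 0.25) -> bool:
    """Return True if the sort buffers of all concurrent tasks fit in the
    allocation while leaving `headroom` (a fraction, assumed value) free."""
    needed = parse_mem(requested) * concurrent_tasks
    budget = parse_mem(allocated) * (1 - headroom)
    return needed <= budget


print(fits("64G", "64GB"))                      # original setting, one task
print(fits("32G", "64GB"))                      # suggested SB_sort_mem, one task
print(fits("32G", "64GB", concurrent_tasks=2))  # suggested setting, two tasks
```

Under these assumptions the original 64G request cannot fit in a 64GB allocation, and the suggested 32G only fits if a single Bwa task runs at a time, so either lowering the value further or raising the allocation is a plausible way forward.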

amoschoomy commented 10 months ago

Hello again, it looks like my real issue is actually the one in #49: the run hits the same error as #49. I have now allocated 500GB of RAM on my cluster; let's see how it goes.