epi2me-labs / wf-transcriptomes


Error about minimap2 #77

Closed Musketeer-D closed 3 weeks ago

Musketeer-D commented 3 months ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version


Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

/root/software/nextflow run epi2me-labs/wf-transcriptomes \
  --fastq /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/batch1/fastq_pass \
  --de_analysis --ref_genome /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
  --transcriptome-source reference-guided \
  --ref_annotation /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/Homo_sapiens.GRCh38.110.chr.gff3 \
  --minimap2_index_opts '-k 15' --sample_sheet ./sample_sheet-1.csv \
  --threads 1 -c mem.config --cdna_kit SQK-PCS109 --out_dir /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/test date

mem.config:

process {
    withName: build_minimap_index_transcriptome {
        memory = 32.GB
    }
}

Workflow Execution - CLI Execution Profile


What happened?

Command error: .command.sh: line 2: 31 Killed minimap2 -t "1" -k 15 -I 1000G -d "genome_index.mmi" "final_non_redundant_transcriptome.fasta"

Relevant log output

Mar-16 15:52:10.470 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 66; name: output (14); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/fe/b707268dc1aa50f940c96142421dd6]
Mar-16 15:52:10.659 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 65; name: output (13); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/1f/dd06196548179b676430370567216f]
Mar-16 15:52:10.756 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 68; name: output (16); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/9b/e09a91b87790e416df7ced3946d246]
Mar-16 15:52:10.862 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 67; name: output (15); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/bf/73c590a5ad5ef3fd82140fbf484983]
Mar-16 15:52:12.004 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 69; name: output (17); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/c4/47cbcb1dfb67298e710ecdaa4b3eea]
Mar-16 15:52:12.279 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 70; name: output (18); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/98/3a32949f97aba9e5dd4c91a1fe26a3]
Mar-16 15:52:12.482 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 72; name: output (20); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/26/645fb73f8272c9b93014f49165c4d1]
Mar-16 15:52:12.799 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 75; name: output (23); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/23/77fad33075498fb3bd43a558ef26d9]
Mar-16 15:52:12.843 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 81; name: output (29); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/ef/1d2bee7cda47289764007f201aa0d5]
Mar-16 15:52:13.147 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 76; name: output (24); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/1c/b6f2d7e0a4b1a0e36f9df6e193f0b9]
Mar-16 15:52:13.795 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 73; name: output (21); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/6b/02127160d48015612f00342bc6ec09]
Mar-16 15:52:14.293 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 71; name: output (19); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/1b/12208fea4ee3e02d24094fedbc6848]
Mar-16 15:52:15.152 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 82; name: output (30); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/90/886e44882fce4408fc33ad3940299c]
Mar-16 15:52:15.268 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 74; name: output (22); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/67/159e00078bdb48c388e828016a9b7f]
Mar-16 15:52:15.801 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 83; name: output (31); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/92/a6e6f497911752d9abba4f520d3d9e]
Mar-16 15:52:16.085 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 79; name: output (27); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/2e/1cef51af8e2096446d44779d212fae]
Mar-16 15:52:16.289 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 84; name: output (32); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/6a/35d7278355ccfcf1b344dde36d5aa1]
Mar-16 15:52:16.637 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 77; name: output (25); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/88/f2b50f7c6121a5efbef1e67edf07ca]
Mar-16 15:52:17.092 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 80; name: output (28); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/77/97cbc6e4d8e11da01cbab9aa816313]
Mar-16 15:52:17.155 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 78; name: output (26); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/9c/fcc4f35c265c0f99b1f5bbeea5c223]
Mar-16 15:54:48.427 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 1 -- submitted tasks are shown below
~> TaskHandler[id: 56; name: pipeline:merge_transcriptomes (1); status: RUNNING; exit: -; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/91/09fd2bdde719aa699b71506b20d422]
Mar-16 15:54:57.581 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 56; name: pipeline:merge_transcriptomes (1); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/91/09fd2bdde719aa699b71506b20d422]
Mar-16 15:54:57.616 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Mar-16 15:54:57.616 [Task submitter] INFO  nextflow.Session - [62/ef8862] Submitted process > pipeline:differential_expression:build_minimap_index_transcriptome (1)
Mar-16 15:58:47.951 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 85; name: pipeline:differential_expression:build_minimap_index_transcriptome (1); status: COMPLETED; exit: 137; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5]
Mar-16 15:58:47.955 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=pipeline:differential_expression:build_minimap_index_transcriptome (1); work-dir=/mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5
  error [nextflow.exception.ProcessFailedException]: Process `pipeline:differential_expression:build_minimap_index_transcriptome (1)` terminated with an error exit status (137)
Mar-16 15:58:47.968 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'pipeline:differential_expression:build_minimap_index_transcriptome (1)'

Caused by:
  Process `pipeline:differential_expression:build_minimap_index_transcriptome (1)` terminated with an error exit status (137)

Command executed:

  minimap2 -t "1" -k 15  -I 1000G -d "genome_index.mmi" "final_non_redundant_transcriptome.fasta"

Command exit status:

Command output:

Command error:
  .command.sh: line 2:    31 Killed                  minimap2 -t "1" -k 15 -I 1000G -d "genome_index.mmi" "final_non_redundant_transcriptome.fasta"

Work dir:

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
Mar-16 15:58:47.971 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Process `pipeline:differential_expression:build_minimap_index_transcriptome (1)` terminated with an error exit status (137)
Mar-16 15:58:47.990 [Task monitor] DEBUG nextflow.Session - The following nodes are still active:
[process] output

Mar-16 15:58:49.183 [main] DEBUG nextflow.Session - Session await > all processes finished
Mar-16 15:58:49.183 [main] DEBUG nextflow.Session - Session await > all barriers passed
Mar-16 15:58:49.232 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local) - terminating tasks monitor poll loop
Mar-16 15:58:49.531 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=84; failedCount=1; ignoredCount=0; cachedCount=0; pendingCount=5; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=2h 42m 23s; failedDuration=3m 50s; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=21; peakCpus=22; peakMemory=93 GB; ]
Mar-16 15:58:49.531 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file
Mar-16 15:58:49.535 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report
Mar-16 15:58:50.169 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline
Mar-16 15:58:50.263 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done
Mar-16 15:58:50.279 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye
(base) root@david-PowerEdge-R740xd:/mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg#

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?


Other demo data information

No response

sarahjeeeze commented 3 months ago

Hi, sorry about that. It looks like the process ran out of memory. Do you know how large the non-redundant transcriptome file is? I will try to recreate the error. Is 32GB the maximum you have available? If not, could you try with more?
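For readers following along: the out-of-memory reading rests on the exit status 137 and the "Killed" message in the log above. By shell convention, a status above 128 means the process was ended by signal number (status - 128), and 137 - 128 = 9 is SIGKILL. A SIGKILL is commonly, though not only, what the kernel out-of-memory killer or a container memory limit sends. A minimal sketch of that arithmetic, where `decode_exit_status` is a hypothetical helper name and not part of Nextflow or the workflow:

```shell
# Hypothetical helper: turn a shell exit status into a readable description.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
decode_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "signal $((status - 128))"
    else
        echo "exit $status"
    fi
}

# The status reported for build_minimap_index_transcriptome in the log.
decode_exit_status 137   # prints "signal 9", i.e. SIGKILL
```

This only identifies the signal. It does not show who sent it, so the memory explanation remains the most likely cause rather than a confirmed one.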

Musketeer-D commented 3 months ago

I am using the human genome GRCh38. I think the non-redundant transcriptome file is about 429M, as you can see:

ll -ht /Users/pricedavid/dataofdavid/code/genome/Homo_sapiens.GRCh38.cdna.all.fa
-rw-rw-rw-@ 1 pricedavid staff 429M Dec 7 12:54 /Users/pricedavid/dataofdavid/code/genome/Homo_sapiens.GRCh38.cdna.all.fa

I cannot run wf-transcriptomes with or without the mem.config parameter (I added the mem.config option as you advised in another issue here).

I ran wf-transcriptomes on a DELL server, which has 256GB RAM and 10TB ROM (no other job was running at the time I ran wf-transcriptomes), so I am quite confused as to why wf-transcriptomes could run out of memory.

Thank you for your kind help!

sarahjeeeze commented 3 months ago

Hi, the file you shared isn't the non-redundant transcriptome. Are you able to navigate to the work directory /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5 and check the size of final_non_redundant_transcriptome.fasta there? We have some memory improvements for minimap2 that we will apply in a future release.
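The size check requested above could be done roughly as follows. This is a sketch, not an official procedure: the work directory and file name are copied from the log in this issue, `staged_file_size` is a hypothetical helper, and it assumes the input is staged into the work directory as a symlink (Nextflow's usual behaviour), which is why the file is read through rather than listed.

```shell
# Hypothetical helper: print the size in bytes of a file, following symlinks.
# Redirecting the file into wc reads the link target, not the link itself.
staged_file_size() {
    wc -c < "$1" | tr -d ' '
}

# Paths taken from the failing task in the log; adjust for your own run.
work_dir=/mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5
fasta="$work_dir/final_non_redundant_transcriptome.fasta"

if [ -e "$fasta" ]; then
    echo "$fasta: $(staged_file_size "$fasta") bytes"
else
    echo "not found: $fasta"
fi
```

Reporting the byte count of this file, rather than of Homo_sapiens.GRCh38.cdna.all.fa, is what allows the memory needed by the indexing step to be estimated.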

If you are only interested in the differential expression of the transcriptome Homo_sapiens.GRCh38.cdna.all.fa, you could set the --transcriptome_source precomputed parameter, which bypasses building a new transcriptome (which can often be large).

sarahjeeeze commented 3 weeks ago

Closing due to lack of response.