epi2me-labs / wf-transcriptomes


Error about minimap2 #77

Closed Musketeer-D closed 3 weeks ago

Musketeer-D commented 3 months ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v1.1.1

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

/root/software/nextflow run epi2me-labs/wf-transcriptomes \
  --fastq /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/batch1/fastq_pass \
  --de_analysis --ref_genome /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
  --transcriptome-source reference-guided \
  --ref_annotation /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/Homo_sapiens.GRCh38.110.chr.gff3 \
  --minimap2_index_opts '-k 15' --sample_sheet ./sample_sheet-1.csv \
  --threads 1 -c mem.config --cdna_kit SQK-PCS109 --out_dir /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/test date

mem.config:

process {
    withName: build_minimap_index_transcriptome {
        memory = 32.GB
    }
}

Workflow Execution - CLI Execution Profile

None

What happened?

Command error: .command.sh: line 2: 31 Killed minimap2 -t "1" -k 15 -I 1000G -d "genome_index.mmi" "final_non_redundant_transcriptome.fasta"

Relevant log output

Mar-16 15:52:10.470 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 66; name: output (14); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/fe/b707268dc1aa50f940c96142421dd6]
Mar-16 15:52:10.659 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 65; name: output (13); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/1f/dd06196548179b676430370567216f]
Mar-16 15:52:10.756 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 68; name: output (16); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/9b/e09a91b87790e416df7ced3946d246]
Mar-16 15:52:10.862 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 67; name: output (15); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/bf/73c590a5ad5ef3fd82140fbf484983]
Mar-16 15:52:12.004 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 69; name: output (17); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/c4/47cbcb1dfb67298e710ecdaa4b3eea]
Mar-16 15:52:12.279 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 70; name: output (18); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/98/3a32949f97aba9e5dd4c91a1fe26a3]
Mar-16 15:52:12.482 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 72; name: output (20); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/26/645fb73f8272c9b93014f49165c4d1]
Mar-16 15:52:12.799 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 75; name: output (23); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/23/77fad33075498fb3bd43a558ef26d9]
Mar-16 15:52:12.843 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 81; name: output (29); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/ef/1d2bee7cda47289764007f201aa0d5]
Mar-16 15:52:13.147 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 76; name: output (24); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/1c/b6f2d7e0a4b1a0e36f9df6e193f0b9]
Mar-16 15:52:13.795 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 73; name: output (21); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/6b/02127160d48015612f00342bc6ec09]
Mar-16 15:52:14.293 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 71; name: output (19); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/1b/12208fea4ee3e02d24094fedbc6848]
Mar-16 15:52:15.152 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 82; name: output (30); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/90/886e44882fce4408fc33ad3940299c]
Mar-16 15:52:15.268 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 74; name: output (22); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/67/159e00078bdb48c388e828016a9b7f]
Mar-16 15:52:15.801 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 83; name: output (31); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/92/a6e6f497911752d9abba4f520d3d9e]
Mar-16 15:52:16.085 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 79; name: output (27); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/2e/1cef51af8e2096446d44779d212fae]
Mar-16 15:52:16.289 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 84; name: output (32); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/6a/35d7278355ccfcf1b344dde36d5aa1]
Mar-16 15:52:16.637 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 77; name: output (25); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/88/f2b50f7c6121a5efbef1e67edf07ca]
Mar-16 15:52:17.092 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 80; name: output (28); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/77/97cbc6e4d8e11da01cbab9aa816313]
Mar-16 15:52:17.155 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 78; name: output (26); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/9c/fcc4f35c265c0f99b1f5bbeea5c223]
Mar-16 15:54:48.427 [Task monitor] DEBUG n.processor.TaskPollingMonitor - !! executor local > tasks to be completed: 1 -- submitted tasks are shown below
~> TaskHandler[id: 56; name: pipeline:merge_transcriptomes (1); status: RUNNING; exit: -; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/91/09fd2bdde719aa699b71506b20d422]
Mar-16 15:54:57.581 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 56; name: pipeline:merge_transcriptomes (1); status: COMPLETED; exit: 0; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/91/09fd2bdde719aa699b71506b20d422]
Mar-16 15:54:57.616 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
Mar-16 15:54:57.616 [Task submitter] INFO  nextflow.Session - [62/ef8862] Submitted process > pipeline:differential_expression:build_minimap_index_transcriptome (1)
Mar-16 15:58:47.951 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 85; name: pipeline:differential_expression:build_minimap_index_transcriptome (1); status: COMPLETED; exit: 137; error: -; workDir: /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5]
Mar-16 15:58:47.955 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=pipeline:differential_expression:build_minimap_index_transcriptome (1); work-dir=/mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5
  error [nextflow.exception.ProcessFailedException]: Process `pipeline:differential_expression:build_minimap_index_transcriptome (1)` terminated with an error exit status (137)
Mar-16 15:58:47.968 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'pipeline:differential_expression:build_minimap_index_transcriptome (1)'

Caused by:
  Process `pipeline:differential_expression:build_minimap_index_transcriptome (1)` terminated with an error exit status (137)

Command executed:

  minimap2 -t "1" -k 15  -I 1000G -d "genome_index.mmi" "final_non_redundant_transcriptome.fasta"

Command exit status:
  137

Command output:
  (empty)

Command error:
  .command.sh: line 2:    31 Killed                  minimap2 -t "1" -k 15 -I 1000G -d "genome_index.mmi" "final_non_redundant_transcriptome.fasta"

Work dir:
  /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
Mar-16 15:58:47.971 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Process `pipeline:differential_expression:build_minimap_index_transcriptome (1)` terminated with an error exit status (137)
Mar-16 15:58:47.990 [Task monitor] DEBUG nextflow.Session - The following nodes are still active:
[process] output

Mar-16 15:58:49.183 [main] DEBUG nextflow.Session - Session await > all processes finished
Mar-16 15:58:49.183 [main] DEBUG nextflow.Session - Session await > all barriers passed
Mar-16 15:58:49.232 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local) - terminating tasks monitor poll loop
Mar-16 15:58:49.531 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=84; failedCount=1; ignoredCount=0; cachedCount=0; pendingCount=5; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=2h 42m 23s; failedDuration=3m 50s; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=21; peakCpus=22; peakMemory=93 GB; ]
Mar-16 15:58:49.531 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file
Mar-16 15:58:49.535 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report
Mar-16 15:58:50.169 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline
Mar-16 15:58:50.263 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done
Mar-16 15:58:50.279 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye
(base) root@david-PowerEdge-R740xd:/mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg#

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

sarahjeeeze commented 3 months ago

Hi, sorry about that. It looks like the process ran out of memory. Do you know how large the non-redundant transcriptome file is? I will try to recreate the error. Is 32GB the maximum you have available? If not, could you try with more?
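The exit status in the log is consistent with that reading. A shell reports 128 plus the signal number when a process is terminated by a signal, so 137 corresponds to signal 9 (SIGKILL), which is the signal the Linux out-of-memory killer sends. A minimal sketch of that decoding, where the function name `signal_from_exit_status` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): map an exit status
# above 128 to the name of the signal that terminated the process.
signal_from_exit_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        # Statuses above 128 mean "terminated by signal (status - 128)".
        kill -l "$((status - 128))"
    else
        echo "none"
    fi
}

signal_from_exit_status 137   # prints KILL
```

SIGKILL alone does not prove an out-of-memory kill. On the host, kernel messages (for example from `dmesg` or `journalctl -k`) that mention the killed process would help confirm it.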

Musketeer-D commented 3 months ago

I am using the human genome GRCh38. I think the non-redundant transcriptome file is about 429M, as you can see:

ll -ht /Users/pricedavid/dataofdavid/code/genome/Homo_sapiens.GRCh38.cdna.all.fa
-rw-rw-rw-@ 1 pricedavid staff 429M Dec 7 12:54 /Users/pricedavid/dataofdavid/code/genome/Homo_sapiens.GRCh38.cdna.all.fa

I cannot run wf-transcriptomes with or without the mem.config parameter (I added the mem.config option as you advised in another issue here).

I ran wf-transcriptomes on a Dell server with 256GB of RAM and 10TB of disk storage (no other job was running at the time), so I am quite confused as to why wf-transcriptomes would run out of memory.

Thank you for your kind help!

sarahjeeeze commented 3 months ago

Hi, the file you shared isn't the non-redundant transcriptome. Are you able to navigate to the work directory /mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5 and check the size of final_non_redundant_transcriptome.fasta there? We have some memory improvements for minimap2 that we will apply in a future release.
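The size check asked for above could be sketched as follows. The work directory and file name are taken from the error log. The helper name `file_size_bytes` is hypothetical. Nextflow typically stages input files into the work directory as symlinks, so the sketch reads through the link instead of reporting the size of the link itself.

```shell
#!/bin/sh
# Hypothetical helper: print a file's size in bytes, following symlinks.
file_size_bytes() {
    # Redirecting the file into wc reads the link target, not the link.
    wc -c < "$1" | tr -d ' '
}

# Work directory and file name as reported in the error log.
work_dir=/mnt/32ac0a57-4519-4a01-8234-37c7fb4537e7/sgbio/deg/work/62/ef8862158c3644155267281a76e7a5
fasta="$work_dir/final_non_redundant_transcriptome.fasta"

if [ -e "$fasta" ]; then
    echo "$fasta: $(file_size_bytes "$fasta") bytes"
else
    echo "not found: $fasta" >&2
fi
```

A plain `ls -l` on a staged symlink shows the size of the link, not of the target, which is why the sketch avoids it.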

If you are only interested in differential expression against the existing transcriptome Homo_sapiens.GRCh38.cdna.all.fa, you could set --transcriptome_source precomputed, which bypasses building the new transcriptome (which can often be large).
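As a rough sketch of that suggestion, the dry run below only prints a candidate command and does not execute it. `--transcriptome_source precomputed` comes from the reply above. The `--ref_transcriptome` option and all `/path/to/...` values are assumptions for illustration and should be checked against the workflow's `--help` output. Other options from the original command may still be required.

```shell
#!/bin/sh
# Placeholder paths (hypothetical): replace with real locations.
fastq_dir=/path/to/fastq_pass
ref_transcriptome=/path/to/Homo_sapiens.GRCh38.cdna.all.fa
ref_annotation=/path/to/Homo_sapiens.GRCh38.110.chr.gff3

# Print the candidate command on one line without running it.
# ASSUMPTION: --ref_transcriptome is the option that supplies the
# precomputed transcriptome; verify it before use.
precomputed_cmd() {
    echo "nextflow run epi2me-labs/wf-transcriptomes" \
        "--fastq $fastq_dir" \
        "--de_analysis" \
        "--transcriptome_source precomputed" \
        "--ref_transcriptome $ref_transcriptome" \
        "--ref_annotation $ref_annotation" \
        "--sample_sheet ./sample_sheet-1.csv"
}

precomputed_cmd
```

Printing the command first makes it easy to review the options before committing to a long run.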

sarahjeeeze commented 3 weeks ago

Closing due to lack of response.