TheJacksonLaboratory / splicing-pipelines-nf

Repository for the Anczukow-Lab splicing pipeline

Required cpus too high? #302

Closed: angarb closed this issue 2 years ago

angarb commented 2 years ago

While running the pipeline, I am seeing errors like:

Jan-13 16:29:12.274 [pool-7-thread-1] DEBUG nextflow.scheduler.Autoscaler - ### The following tasks have been waiting for more than 5m -- required cpus=98; taskIds=1,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
Jan-13 16:29:12.677 [pool-7-thread-1] DEBUG nextflow.scheduler.Autoscaler - ### Requesting 49 instance(s) of type: id=n1-standard-2; cpus=2; mem=7.5 GB; disk=10 GB x 128 -- missing-cpus: 98; node-needed: 49; current-size: 1; max-size: 2147483647
Jan-13 16:30:36.940 [pool-6-thread-2] DEBUG nextflow.executor.IgScriptTask - Completed task > gen3_drs_fasp (GTEX-XXXXX-2526-SM-XXXXX.Aligned.sortedByCoord.out.patched.md.bam) -- taskId=2; exitStatus=125
Jan-13 16:30:36.969 [pool-6-thread-2] DEBUG n.executor.IgFileStagingStrategy - Unstaging file names: [*.bam, command-logs-*]

and

docker: failed to register layer: Error processing tar file(exit status 2): fatal error: runtime: out of memory

We will probably have to address this by lowering max_cpus and removing the memory allocation in some processes.
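
The instance count in the Autoscaler log line above is consistent with simple arithmetic: 98 missing CPUs divided by the 2 CPUs of an n1-standard-2 gives the 49 nodes requested. A minimal Python sketch of that calculation follows. The function name `nodes_needed` is hypothetical, and this is only a reading of the numbers in the log, not Nextflow's actual implementation.

```python
import math


def nodes_needed(missing_cpus: int, cpus_per_node: int) -> int:
    """Smallest number of nodes whose combined CPUs cover the shortfall.

    Hypothetical helper that mirrors the numbers in the log; it is not
    taken from the Nextflow source.
    """
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    return math.ceil(missing_cpus / cpus_per_node)


# Values copied from the log: missing-cpus: 98, n1-standard-2 has cpus=2.
print(nodes_needed(98, 2))  # 49, matching "node-needed: 49"
```

Under the same assumption, a 4-CPU machine type would need `nodes_needed(98, 4)`, that is 25 nodes, for the same shortfall. This says nothing about memory, so it does not by itself explain the Docker out-of-memory error.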

Vlad-Dembrovskyi commented 2 years ago

Should be fixed by #305

Looks like it: https://cloudos.lifebit.ai/public/jobs/61e972878c574a01e8db7fa7

Pending a big test with 100 samples to prove it. @angarb to redo this job: https://cloudos.lifebit.ai/public/jobs/61e0511a8c574a01e8d7b022

Vlad-Dembrovskyi commented 2 years ago

So, n1-standard-4 works fine with 100 files: https://cloudos.lifebit.ai/app/jobs/61f2d25a8c574a01e8e17df9

But for some reason n2-high-cpu-4 does not: https://cloudos.lifebit.ai/app/jobs/61f2ce198c574a01e8e17174

Needs further investigation. Merging linked PR for now.

angarb commented 2 years ago

@Vlad-Dembrovskyi - did we close this?