vib-singlecell-nf / vsn-pipelines

A repository of pipelines for single-cell data in Nextflow DSL2
GNU General Public License v3.0

[BUG] scenic multirun stalls during Arboreto with multiprocessing #375

Open Jay-Leung opened 2 years ago

Jay-Leung commented 2 years ago

Describe the bug I was previously using pyscenic to identify regulons in my scRNA-Seq data. To make the identified regulons more robust, I switched to VSN scenic with multiruns. The example given in the documentation works with nRuns = 2. However, when I used my own dataset and increased nRuns to 100, the pipeline stalled during the "Arboreto with multiprocessing" step (once at 21% and once at 64%). CPU usage drops to 0%, so the process does not appear to be running, even though no error is shown.

I apologise that I cannot include a screenshot: I restarted the run with -resume, and it now seems to have skipped the Arboreto step that had reached 64% and moved on to the next step (add Pearson correlation).

I wondered whether this is related to the compute resources in the config file, so I increased some of the parameters. I am quite new to this and not sure which exact parameters to adjust, so could you kindly take a look at my config file? I am running on a local workstation with 16 cores/32 threads and 120 GB RAM.
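When choosing values for the compute resources, it may help to confirm how many CPUs the operating system itself reports on the workstation. The following is a minimal Python sketch, not part of VSN or pyscenic; `describe_cpus` is a hypothetical helper name. It returns the logical CPU count, which on a 16-core machine with simultaneous multithreading enabled would typically be 32. Whether the `cpus` values in the config are counted against this number is an assumption here, not something confirmed by the pipeline.

```python
import os


def describe_cpus():
    """Return the number of logical CPUs reported by the OS.

    os.cpu_count() counts logical CPUs (hardware threads), not physical
    cores, and may return None when the count cannot be determined.
    """
    return os.cpu_count() or 0


if __name__ == "__main__":
    print(f"Logical CPUs visible to the OS: {describe_cpus()}")
```

Comparing this number with the total CPUs requested by concurrently running tasks gives a rough check for over-subscription.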

To Reproduce Steps to reproduce the behavior:

  1. Configure with these options:

    nextflow config \
    -profile hg38,scenic,scenic_multiruns,loom,scenic_use_cistarget_motifs,scenic_use_cistarget_tracks,singularity \
    vib-singlecell-nf/vsn-pipelines > nf.config

  2. Run using this entry point:

    NXF_VER=21.04.3 nextflow -C nf.config run vib-singlecell-nf/vsn-pipelines -entry scenic -r v0.27.0

  3. See error: The run stalls at a certain percentage, with 0% CPU usage. I have to press Ctrl+C to force it to stop.

Expected behavior Process should continue.


Additional context config.txt execution_trace.txt [angry_roentgen] Nextflow Workflow Report.pdf execution_timeline.pdf

I have attached the config file and execution trace as .txt files, and the report and timeline as PDF files. The 65th Arboreto run stalled for more than 5 hours before I pressed Ctrl+C to force it to stop. I would appreciate some guidance on the compute resource parameters. Specifically, does cpus in the config refer to the number of physical cores, or does it count threads as well?
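For anyone inspecting a trace like the attached one, a minimal Python sketch for listing tasks that did not finish is shown here. It assumes the trace is tab-separated with `name` and `status` columns (as in Nextflow's default trace layout) and that finished tasks are marked `COMPLETED`. The sample rows, task names, and the `unfinished_tasks` helper are hypothetical and are not taken from the attached file.

```python
import csv
import io

# Hypothetical sample: made-up task names and hashes, for illustration only.
SAMPLE_TRACE = "\n".join([
    "\t".join(["task_id", "hash", "name", "status", "exit"]),
    "\t".join(["1", "aa/000001", "example_task_a", "COMPLETED", "0"]),
    "\t".join(["2", "bb/000002", "example_task_b", "ABORTED", "-"]),
])


def unfinished_tasks(trace_text):
    """Return (name, status) pairs for every task whose status is not COMPLETED."""
    reader = csv.DictReader(io.StringIO(trace_text), delimiter="\t")
    return [
        (row["name"], row["status"])
        for row in reader
        if row["status"] != "COMPLETED"
    ]


if __name__ == "__main__":
    for name, status in unfinished_tasks(SAMPLE_TRACE):
        print(f"{name}: {status}")
```

To use it on a real file, read the file contents into a string and pass that to `unfinished_tasks` in place of `SAMPLE_TRACE`.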

Thank you so much!