nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License

RENAME_RAW_DATA_FILES - Process requirement exceeds available CPUs #626

Closed: amalacrino closed this issue 1 year ago

amalacrino commented 1 year ago

Hello everybody! I'm experiencing an issue with ampliseq on an HPC with SLURM (Nextflow version 23.04.1). The command I'm running is:

nextflow run nf-core/ampliseq -r 2.3.2 -profile docker \
--input $INDIR \
--FW_primer "GTGYCAGCMGCCGCGGTAA" \
--RV_primer "GGACTACNVGGGTWTCTAAT" \
--outdir $OUTDIR \
--illumina_novaseq \
--extension "/*_{1,2}.fastq.gz" \
--skip_qiime \
--skip_barplot \
--skip_abundance_tables \
--skip_alpha_rarefaction \
--skip_diversity_indices \
--skip_ancom \
--ignore_empty_input_files

The job stops immediately at the first step, NFCORE_AMPLISEQ:AMPLISEQ:RENAME_RAW_DATA_FILES, with the following error:

ERROR ~ Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:RENAME_RAW_DATA_FILES (S035)'
Caused by:
  Process requirement exceeds available CPUs -- req: 2; avail: 1
Command executed:
  [ -f "S035_1.fastq.gz" ] || ln -s "S035_1.fastq.gz" "S035_1.fastq.gz"
  [ -f "S035_2.fastq.gz" ] || ln -s "S035_2.fastq.gz" "S035_2.fastq.gz"

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_AMPLISEQ:AMPLISEQ:RENAME_RAW_DATA_FILES":
      sed: $(sed --version 2>&1 | sed -n 1p | sed 's/sed (GNU sed) //')
  END_VERSIONS
Command exit status:
  -
Command output:
  (empty)

I tried several versions of ampliseq and some small tweaks, but nothing seems to work. I also searched on GitHub and elsewhere, but I haven't found anything discussing this error. Do you have any idea what might cause it? Happy to provide further info if needed.

Thanks!!

agrier-wcm commented 1 year ago

By default, this pipeline runs its processes with anywhere from 1 to 16 CPUs. So you need to either request 16 CPUs from SLURM, or add the following argument to your Nextflow command:

--max_cpus 1
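
If you're not sure how many CPUs your SLURM allocation actually grants (and therefore what value --max_cpus must not exceed), you can check from inside the job. A minimal sketch, assuming a standard SLURM environment (SLURM_CPUS_PER_TASK is only set when --cpus-per-task was requested):

echo "CPUs granted by SLURM: ${SLURM_CPUS_PER_TASK:-unset}"
nproc    # CPUs actually visible to processes inside the job

Nextflow's local executor inside the job can only use what it sees here, which is consistent with the error above (req: 2; avail: 1).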

You may encounter a similar error for memory. By default, processes are run with up to 128 GB of memory. So you need to either request that much memory from SLURM, or add the following argument to your Nextflow command:

--max_memory '8.GB'

Change 8 to however many GB of memory your SLURM job actually has.
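
For reference, here is a minimal sketch of a SLURM batch script that requests resources and passes matching caps to the pipeline. The job name, CPU, memory, and time values are placeholders rather than recommendations, the other options from your original command (--illumina_novaseq, --extension, the --skip_* flags) are omitted for brevity, and $INDIR/$OUTDIR are assumed to be defined in the script or environment:

#!/bin/bash
#SBATCH --job-name=ampliseq
#SBATCH --cpus-per-task=8       # placeholder; keep in sync with --max_cpus below
#SBATCH --mem=32G               # placeholder; keep in sync with --max_memory below
#SBATCH --time=24:00:00

nextflow run nf-core/ampliseq -r 2.3.2 -profile docker \
    --input $INDIR \
    --FW_primer "GTGYCAGCMGCCGCGGTAA" \
    --RV_primer "GGACTACNVGGGTWTCTAAT" \
    --outdir $OUTDIR \
    --max_cpus 8 \
    --max_memory '32.GB'

That way no single process can request more CPUs or memory than the SLURM allocation provides.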

amalacrino commented 1 year ago

Thanks @agrier-wcm! I had tried including both of those arguments, but apparently the only combination that really works in my case is 16 CPUs and 128 GB. Perhaps I need to keep the CPU/memory proportion? I'm going to investigate, but right now I'm just glad it is working. Thanks a lot!