nf-core / methylseq

Methylation (Bisulfite-Sequencing) analysis pipeline using Bismark or bwa-meth + MethylDackel
https://nf-co.re/methylseq
MIT License

Multiple issues in the pipeline #406

Closed: gevro closed this issue 2 months ago

gevro commented 2 months ago

Description of the bug

Hi, in the dev version there are multiple issues:

  1. The pipeline seems to run the BISMARK_ALIGN step for the first sample as many times as there are samples, instead of running a separate BISMARK_ALIGN step for each sample.

  2. The BISMARK_ALIGN step reports that it cannot find the mm10.fa reference file, even though the reference was configured correctly and the file is present:

Error:

-[nf-core/methylseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_METHYLSEQ:METHYLSEQ:BISMARK:BISMARK_ALIGN (M1)'

Caused by:
  Process `NFCORE_METHYLSEQ:METHYLSEQ:BISMARK:BISMARK_ALIGN (M1)` terminated with an error exit status (2)

Command executed:

  bismark \
      -1 M1_1_val_1.fq.gz -2 M1_2_val_2.fq.gz \
      --genome BismarkIndex \
      --bam \
      --bowtie2         --maxins 1000 --multicore 4

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_METHYLSEQ:METHYLSEQ:BISMARK:BISMARK_ALIGN":
      bismark: $(echo $(bismark -v 2>&1) | sed 's/^.*Bismark Version: v//; s/Copyright.*$//')
  END_VERSIONS

Command exit status:
  2

Command output:
  (empty)

Command error:
  Bowtie 2 seems to be working fine (tested command 'bowtie2 --version' [2.4.5])
  Output format is BAM (default)
  Alignments will be written out in BAM format. Samtools found here: '/usr/local/bin/samtools'
  Reference genome folder provided is BismarkIndex/     (absolute path is '/gpfs/scratch/user/methylation/work/17/9f7fe2cc089132a7e63fbc29c8d339/BismarkIndex/)'
  FastQ format assumed (by default)

  Input files to be analysed (in current folder '/gpfs/scratch/user/methylation/work/58/93334847cd4de313854a32d6a7333a'):
  M1_1_val_1.fq.gz
  M1_2_val_2.fq.gz
  Library is assumed to be strand-specific (directional), alignments to strands complementary to the original top or bottom strands will be ignored (i.e. not performed!)
  Summary of all aligner options:       -q --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --dovetail --maxins 1000
  Running Bismark Parallel version. Number of parallel instances to be spawned: 4

  Current working directory is: /gpfs/scratch/user/methylation/work/58/93334847cd4de313854a32d6a7333a

  Now reading in and storing sequence information of the genome specified in: /gpfs/scratch/user/methylation/work/17/9f7fe2cc089132a7e63fbc29c8d339/BismarkIndex/

  Failed to read from sequence file mm10.fa No such file or directory

Work dir:
  /gpfs/scratch/user/methylation/work/58/93334847cd4de313854a32d6a7333a

But the file is present:

$ ls /gpfs/scratch/user/methylation/work/17/9f7fe2cc089132a7e63fbc29c8d339/BismarkIndex/
Bisulfite_Genome  mm10.fa

Command used and terminal output

No response

Relevant files

No response

System information

Nextflow 23.04.4

Executor: slurm

gevro commented 2 months ago

Hi, I found the solution. I had to add this to the Nextflow config: process.stageInMode = 'copy'
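
For anyone hitting the same error, here is a minimal sketch of how that setting might look in a custom config file. By default Nextflow stages input files into each task's work directory as symbolic links; with 'copy' it copies them instead. One plausible explanation, not confirmed in this thread, is that mm10.fa inside BismarkIndex/ was a symlink whose target could not be resolved from where the task actually ran. The file name custom.config is only an illustration.

```groovy
// custom.config (hypothetical file name, use whatever config you already pass to Nextflow)
// Copy input files into each task's work directory instead of symlinking them.
// Nextflow's default stageInMode is 'symlink'.
process.stageInMode = 'copy'
```

The file can be supplied with Nextflow's -c option on the usual `nextflow run` command line. Copying references and FASTQ files into every work directory costs extra disk space and I/O, so this trades storage for robustness.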