FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

Bismark runs into an error when I try to run it using Nextflow #683

Closed Ephantus-Wambui closed 4 months ago

Ephantus-Wambui commented 4 months ago

I am trying to create an EMseq nextflow pipeline but I keep running into this error when I try to do alignments for my reads:

N E X T F L O W ~ version 24.04.3

Launching EMseq_pipeline.nf [infallible_hugle] DSL2 - revision: 1fa21e8535

executor > local (1)
[9b/dde1d4] process > highRead_alignment (1) [ 0%] 0 of 1
ERROR ~ Error executing process > 'highRead_alignment (1)'

Caused by: Process highRead_alignment (1) terminated with an error exit status (141)

Command executed:

bismark --un --ambiguous --genome genome_test -1 A56_R1_course2_val_1.fq.gz -2 A56_R2_course2_val_2.fq.gz

Command exit status: 141

Command output:

chr CM066671.1 (34436705 bp)

Command error:

Bowtie 2 seems to be working fine (tested command 'bowtie2 --version' [2.4.1])
Output format is BAM (default)
Alignments will be written out in BAM format. Samtools found here: '/home/ephantus/anaconda3/envs/methylation/bin/samtools'
Reference genome folder provided is genome_test/ (absolute path is '/home/ephantus/Desktop/GitHub/EMseq_nextflow_pipeline/data/genome_test/)'
FastQ format assumed (by default)

Here is a snippet of the code I'm trying to run:

// Run Bismark high reads alignment process
process highRead_alignment {
    // Copy the output to the highReads_bam directory from work directory
    publishDir("${params.genomeHighBAM}", mode: 'copy')
    // Define the number of CPUs to use
    cpus 4
    // Define the memory to use
    memory 6.GB

    // Define the input
    input:
    tuple val(read_id), path(fastq1), path(fastq2)
    // genome directory
    path genomeDir

    output:
    path "*_bismark_bt2_pe.bam", emit: aligned_bam_files
    path "*_bismark_bt2_PE_report.txt", emit: bismark_reports

    script:
    """
    bismark --un --ambiguous --genome $genomeDir -1 ${fastq1} -2 ${fastq2}
    """
}

workflow {
    // Define channels
    genome_prep = Channel.fromPath(params.genomeDir)
    high_trimmed = Channel.fromFilePairs(params.hightrimmedreads, flat: true)

    // run read alignment process for high reads
    highRead_alignment(high_trimmed, genome_prep)
}

But when I run the same alignment from a bash script, it completes without any errors. What could I be doing wrong? Thank you for your help.

FelixKrueger commented 4 months ago

I am not exactly sure where the error 141 is coming from; some people seem to have had issues like this in the past. It has been suggested that it could have to do with the versions of Samtools and Bowtie2 being used, see here: https://github.com/FelixKrueger/Bismark/issues/64

Any chance that the nextflow workflow and the local bash script use different versions of the above mentioned software?
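
For context on the number itself: by shell convention, an exit status above 128 means the command was terminated by a signal, and the signal number is the status minus 128. On Linux, 141 - 128 = 13, which is SIGPIPE (a write to a pipe whose reader has already exited). The sketch below only decodes the status; it does not show which program in the alignment chain received the signal. The function name `describe_exit_status` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): decode a shell exit status.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_exit_status() {
    local status="$1"
    if [ "$status" -gt 128 ]; then
        # `kill -l <number>` prints the signal name without the SIG prefix
        echo "signal $(kill -l "$((status - 128))")"
    else
        echo "exit code $status"
    fi
}

# The status reported by Nextflow in this issue
describe_exit_status 141
```

A status that decodes to PIPE usually points at a pipeline between tools rather than at the input data, which fits the suggestion to compare tool versions.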

Ephantus-Wambui commented 4 months ago

No, both the Nextflow workflow and the local bash script use the same version of Bismark. In fact, I'm using the same conda environment to run both the Nextflow script and the bash script.
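
One way to confirm this from inside each context is to print which executable is resolved on PATH and what version it reports. This is a minimal sketch, not part of Bismark or Nextflow: the function name `report_versions` is hypothetical, and it assumes each tool accepts `--version` (the log above shows Bismark itself calling `bowtie2 --version`).

```shell
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): for each tool, print its resolved
# path and the first non-empty line of `<tool> --version`, or say it is missing.
report_versions() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            # awk reads all of the tool's output, so the tool is never cut off mid-write
            printf '%s\t%s\t%s\n' "$tool" "$(command -v "$tool")" \
                "$("$tool" --version 2>&1 | awk 'NF && !seen { print; seen = 1 }')"
        else
            printf '%s\tnot found on PATH\n' "$tool"
        fi
    done
}

report_versions bismark bowtie2 samtools
```

Running this once in the terminal and once from within the process script, then comparing the two outputs, would show whether both resolve the same binaries. Inside a Nextflow `script:` block written with triple double quotes, bash variables such as `$tool` would need to be escaped as `\$tool`, so keeping the helper in a separate file and calling that file is probably simpler.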

FelixKrueger commented 4 months ago

I wonder if this question isn't maybe better asked in some kind of Nextflow forum (e.g. here: https://community.seqera.io/), I am pretty sure it isn't a Bismark issue as such.

I have also noticed that your version of Bowtie2 is already some 4 years out of date, maybe this or samtools have difficulties when being run from within Nextflow somehow? Just out of interest, have you tried running the nf-core methylseq workflow in EM-seq mode (https://nf-co.re/methylseq/2.6.0/) on your data to see if that works?

Ephantus-Wambui commented 4 months ago

I updated Bowtie2 and Bismark, and the pipeline now runs fine with no errors. As you suggested, the issue was Bowtie2, which was 4 years out of date; once I updated it, everything worked. Thank you for that.