FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

Mapping efficiency: 0.0%; bowtie2-align exited with value 141 #607

Open nstepi62 opened 1 year ago

nstepi62 commented 1 year ago

Hi,

I'm working with short-read, enzymatically converted human genome reads produced by nanopore sequencing. The expected read length is about 150-200 bp, plus additional Illumina and nanopore adaptors, so I trimmed the reads using porechop_ABI. Now I am trying to analyze them with regard to methylation. I figured that Bismark should work reasonably well for enzymatically converted DNA, since the expected conversion pattern is the same.

Conversion of the reference genome (hg19) worked well and resulted in two equally sized files. I used the suggested standard command to run Bismark, and later added the -N 1 flag while trying to solve the issue:
bismark -N 1 --genome_folder path/to/reference_hg19/ combined2.fastq.gz

Bismark seems to run well until:

Found first alignment: aef9bb32-00e3-49c4-8c36-52301f933b67_runid=xxxxxxxxxxxxxx_read=24_ch=425_start_time=2023-07-12T14:52:00.375870+02:00_flow_cell_id=XXX_sample_id=xxx_parent_read_id=aef9bb3
ATTTTTAATTATTTTAATTTAGTAAGTTTTTTTGTAATAATTTATAA //0/-&%%(('$%,'',,-.',/2:;965>=;92844620/24/+%% YT:Z:UU

Now starting the Bowtie 2 aligner for CTreadGAgenome (reading in sequences from combined2.fastq.gz_C_to_T.fastq with options -q -N 1 --score-min L,0,-0.2 --ignore-quals --nofw)
Using Bowtie 2 index: path/to/reference_hg19/Bisulfite_Genome/GA_conversion/BS_GA

Found first alignment: aef9bb32-00e3-49c4-8c36-52301f933b67_runid=xxxxxxxxxxxxx_read=24_ch=425_start_time=2023-07-12T14:52:00.375870+02:00_flow_cell_id=xxxxxxxxx_sample_id=xxxxxxxx_parent_read_id=aef9bb3
ATTTTTAATTATTTTAATTTAGTAAGTTTTTTTGTAATAATTTATAA //0/-&%%(('$%,'',,-.',/2:;965>=;92844620/24/+%% YT:Z:UU

Writing bisulfite mapping results to combined2_bismark_bt2.bam <<<

The temporary combined2.fastq.gz_C_to_T.fastq file seems to have a reasonable size (10.254.614 KB).

I end up with the following report, which includes the "bowtie2-align exited with value 141" error:

Final Alignment report

Sequences analysed in total: 13994963
Number of alignments with a unique best hit from the different alignments: 0
Mapping efficiency: 0.0%

Sequences with no alignments under any condition: 13994963
Sequences did not map uniquely: 0
Sequences which were discarded because genomic sequence could not be extracted: 0

Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 0 ((converted) top strand)
CT/GA: 0 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Final Cytosine Methylation Report

Total number of C's analysed: 0

Total methylated C's in CpG context: 0
Total methylated C's in CHG context: 0
Total methylated C's in CHH context: 0
Total methylated C's in Unknown context: 0

Total unmethylated C's in CpG context: 0
Total unmethylated C's in CHG context: 0
Total unmethylated C's in CHH context: 0
Total unmethylated C's in Unknown context: 0

Can't determine percentage of methylated Cs in CpG context if value was 0
Can't determine percentage of methylated Cs in CHG context if value was 0
Can't determine percentage of methylated Cs in CHH context if value was 0
Can't determine percentage of methylated Cs in Unknown context (CN or CHN) if value was 0

Bismark completed in 0d 0h 4m 27s

==================== Bismark run complete

(ERR): bowtie2-align exited with value 141
(ERR): bowtie2-align exited with value 141

I've read the other threads concerning this error and checked the versions of all programs involved:

Bismark v0.24.1
bowtie2-align-s version 2.5.1
samtools 1.17 and samtools 1.18 (1.17 is the version I got when installing Bismark from GitHub in a separate conda environment; I updated to 1.18 after it didn't work)

As far as I know, these are now the most up-to-date versions.

Some of my reads are really short, so I wonder if this might be the issue. Nanopore sequencing is also error-prone, so maybe I need to increase the mismatch tolerance; however, looking at the --score-min options (--score-min L,0,-0.2 --ignore-quals), I really wasn't sure how to proceed. I ran FastQC on the input files, and the results seem to indicate that the conversion took place.

[Image: FastQC screenshot]
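For reference, here is a rough sketch of what the --score-min L,0,-0.2 setting mentioned above means in practice. Bowtie 2 documents L,a,b as the linear function a + b * read_length, so the minimum allowed alignment score scales with read length. The sketch assumes Bowtie 2's default end-to-end penalties (a maximum mismatch penalty of 6, which --ignore-quals applies to every mismatch) and ignores gaps and Ns. The function names are made up for illustration and are not part of Bismark or Bowtie 2.

```python
def min_score(read_length, intercept=0.0, slope=-0.2):
    """Minimum alignment score allowed by --score-min L,<intercept>,<slope>."""
    return intercept + slope * read_length


def max_mismatches(read_length, mismatch_penalty=6, intercept=0.0, slope=-0.2):
    """Number of full-penalty mismatches that still pass the threshold.

    Assumption: end-to-end mode, where a perfect alignment scores 0 and
    each mismatch subtracts `mismatch_penalty` (Bowtie 2 default maximum: 6).
    """
    budget = -min_score(read_length, intercept, slope)
    if budget < 0:
        return 0
    return int(budget // mismatch_penalty)


if __name__ == "__main__":
    # 47 bp is the length of the read shown in the log above;
    # 150 and 200 bp are the expected read lengths.
    for length in (47, 150, 200):
        print(length, min_score(length), max_mismatches(length))
```

Under these assumptions, a 47 bp read like the one in the log tolerates a single mismatch, and a 150 bp read tolerates five. A 1 bp read gap costs 8 with Bowtie 2's default gap penalties (open 5 + extend 3), so reads with several indels can use up that budget quickly.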

Do you have any idea what major issue I might be facing but not seeing? I'm grateful for any suggestions :)

FelixKrueger commented 1 year ago

I am soooo sorry, I seem to have missed this issue entirely. The error 141 comes up every so often, and mostly seems to go away when Bowtie2 or Samtools is updated. Did you get a chance to try it out on a different machine in the meantime?
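As a side note on the number itself: by shell convention, an exit status above 128 usually means the process was terminated by signal (status - 128), and on Linux signal 13 is SIGPIPE. So 141 generally indicates that bowtie2-align was writing to a pipe whose reader had already closed it, which can make it a symptom rather than the root cause. A minimal sketch for decoding such values, assuming POSIX signal numbering (the helper name is hypothetical):

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status (assumes POSIX conventions)."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(141))
```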

As another suggestion, for Nanopore data you might want to look at running Bismark with minimap2 instead of Bowtie2? We have used this successfully for Nanopore data and EM-seq data. Sorry again for missing this issue...