BenLangmead / bowtie

An ultrafast memory-efficient short read aligner

Bowtie not mapping 100% #130

Closed Stakaitis closed 2 years ago

Stakaitis commented 2 years ago

Issue: Bowtie fails to align a 21 nt read even though its first 20 nt match the reference exactly; the read differs only by one extra nucleotide at the very end.

Example: ref.fa

    >5P_ath-miR159a_3_2-OMe
    TTTGGATTGAAGGGAGCTCT

r1.fq

    @A01058:221:HLKLTDSX3:4:1101:1045:1016 1:N:0:NATCAG
    TTTGGATTGAAGGGAGCTCTA
    +
    FFFFFFFFFFFFFFFFFFFFF

Commands (also tried various alignment arguments):

    bowtie-build ref.fa ref
    bowtie ref r1.fq

Report:

    reads processed: 1
    reads with at least one reported alignment: 0 (0.00%)
    reads that failed to align: 1 (100.00%)
    No alignments

Version: 1.2.3, installed on a Mac by running conda install -c bioconda bowtie

mschilli87 commented 2 years ago

IIRC, unlike Bowtie 2, BWA, and many other mappers, Bowtie does not soft-clip reads, i.e. a mismatch at the beginning or end of a read counts as such. I assume that, because the sequence is so short, this one mismatch pushes the alignment score below the (default) threshold.
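To make the end-to-end point concrete, here is a toy sketch in Python. It is not how Bowtie works internally (Bowtie uses an index rather than a scan), and the function name `end_to_end_hits` is made up for illustration. It assumes the whole read has to be placed inside the reference without gaps or clipping, and it uses the two sequences from the report above.

```python
def end_to_end_hits(read, ref, max_mismatches=0):
    """Return (offset, mismatches) for every ungapped placement of the
    whole read inside ref. Illustrative only; hypothetical helper."""
    hits = []
    # A placement exists only if the entire read fits within the reference,
    # so a read longer than the reference yields an empty range here.
    for offset in range(len(ref) - len(read) + 1):
        window = ref[offset:offset + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append((offset, mismatches))
    return hits


REF = "TTTGGATTGAAGGGAGCTCT"    # 20 nt, from ref.fa
READ = "TTTGGATTGAAGGGAGCTCTA"  # 21 nt, from r1.fq

# Full 21 nt read, even with a generous mismatch budget.
print(end_to_end_hits(READ, REF, max_mismatches=3))

# Same read with its last base removed (what soft-clipping or trimming would do).
print(end_to_end_hits(READ[:-1], REF))
```

Under that assumption the 21 nt read has no valid placement on a 20 nt reference regardless of the mismatch budget, while the same read without its last base matches at offset 0 with no mismatches.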

Judging by the particular length of your read and the sequence name in your example, you might want to have a look at how miRDeep2 maps sRNA-seq reads to pre-miRNAs using Bowtie. :wink: See here for its default number of accepted mismatches. You will probably have to optimize the parameters further for your use case of mapping straight to the mature miRNAs.