BenLangmead / bowtie2

A fast and sensitive gapped read aligner
GNU General Public License v3.0

Does the --end-to-end option override options with the letter "L"? #468

Closed santataRU closed 2 months ago

santataRU commented 2 months ago

Dear Bowtie2 developers,

This is the Bowtie 2 command I run:

bowtie2 --end-to-end --sensitive --score-min L,0,-0.24 -k 1 --n-ceil L,0,0.05 --threads $CPUS -x $GENOME_BOWTIE2_INDEX -U $READ_FILE

I have two questions about this command:

1) Do the "L" letters after "--score-min" and "--n-ceil" refer to local mode or to seed length? I may be confusing them with the "-L" seed length option.

2) If the "L" letters do mean local mode, does this contradict the "--end-to-end" option? In that case, which option overrides the other?

Regards,

Xiao

ch4rr0 commented 2 months ago

Hello,

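The L in L,0,-0.24 has nothing to do with local or end-to-end mode, and it is not the -L seed length option either. It stands for a (L)inear function. For --score-min, the function gives the minimum score an alignment must reach to be considered valid; for --n-ceil, it gives the maximum number of Ns (ambiguous characters) a read may contain and still be considered for alignment. Both are calculated as a function of the read length, so neither conflicts with --end-to-end. I hope this helps. The relevant section of the manual is quoted below.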
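As a worked example, take a read length of x = 100 (a value chosen here purely for illustration, not taken from the command above). The two linear functions in the command then evaluate to:

```latex
\begin{aligned}
\text{--score-min L,0,-0.24:}\quad f(100) &= 0 + (-0.24)\cdot 100 = -24 \\
\text{--n-ceil L,0,0.05:}\quad f(100) &= 0 + 0.05\cdot 100 = 5
\end{aligned}
```

Under that assumption, an alignment would need a score of at least -24 to be reported, and the N ceiling for the read would be 5.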

Setting function options

Some Bowtie 2 options specify a function rather than an individual number or setting. In these cases the user specifies three parameters: (a) a function type F, (b) a constant term B, and (c) a coefficient A. The available function types are constant (C), linear (L), square-root (S), and natural log (G). The parameters are specified as F,B,A - that is, the function type, the constant term, and the coefficient are separated by commas with no whitespace. The constant term and coefficient may be negative and/or floating-point numbers.

For example, if the function specification is L,-0.4,-0.6, then the function defined is:

f(x) = -0.4 + -0.6 * x

If the function specification is G,1,5.4, then the function defined is:

f(x) = 1.0 + 5.4 * ln(x)
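To check what a given specification evaluates to, the F,B,A format described above can be sketched in a few lines of Python. This is only an illustration of the manual text, not Bowtie 2's own parser: the names `parse_function_spec` and `evaluate` are hypothetical, and the sketch assumes that the constant type (C) ignores the coefficient and that Bowtie 2 applies no additional clamping or rounding.

```python
import math

# How x enters the function for each type named in the manual excerpt.
# Assumption: for the constant type (C) the coefficient term is unused.
SHAPES = {
    "C": lambda x: 0.0,  # constant
    "L": lambda x: x,    # linear
    "S": math.sqrt,      # square root
    "G": math.log,       # natural log
}

def parse_function_spec(spec):
    """Split an 'F,B,A' string into (type, constant term, coefficient)."""
    parts = spec.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected F,B,A but got {spec!r}")
    func_type, constant, coefficient = parts
    if func_type not in SHAPES:
        raise ValueError(f"unknown function type {func_type!r}")
    return func_type, float(constant), float(coefficient)

def evaluate(spec, x):
    """Return f(x) = B + A * shape(x) for the given specification."""
    func_type, constant, coefficient = parse_function_spec(spec)
    return constant + coefficient * SHAPES[func_type](x)

if __name__ == "__main__":
    read_length = 100  # hypothetical read length, for illustration only
    for spec in ("L,-0.4,-0.6", "G,1,5.4", "L,0,-0.24", "L,0,0.05"):
        print(spec, "->", evaluate(spec, read_length))
```

Calling `evaluate("L,0,-0.24", n)` with your actual read length `n` gives the score threshold implied by the --score-min setting in the question, under the assumptions stated above.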