ksahlin / strobealign

Aligns short reads using dynamic seed size with strobemers
MIT License

Missing alignment #440

Closed by yazhinia 2 months ago

yazhinia commented 2 months ago

Hi, I occasionally observe that a contig has a fractional abundance value in an --aemb run, but no alignment is found for that contig in the SAM output. Could it be that a read had multiple equally good mappings, but only one alignment was selected at random when writing the SAM file?

Note: I used the --eqx flag for the SAM alignment output, in case it matters.

Thank you.

ksahlin commented 2 months ago

Hi @yazhinia,

Running strobealign in default mode (extension alignment) makes it output a single location per read (if there are several equally good locations, one is chosen at random).

In aemb mode, a read of length |r| with n equally good mappings counts |r|/n towards the coverage of each of those locations.

This can cause a positive coverage for a contig with aemb, but 0 reads aligned to it in extension mode, if the random placement never selected the contig.
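To make this first scenario concrete, here is a small toy model in Python. It is only a sketch of the counting logic described above, not strobealign's actual implementation; the function names, the contig names, and the 150 bp read are all made up for illustration.

```python
import random
from collections import defaultdict


def aemb_style_coverage(reads):
    """Split each read's length evenly over its equally good locations."""
    bases = defaultdict(float)
    for read_len, contigs in reads:
        share = read_len / len(contigs)  # |r| / n
        for contig in contigs:
            bases[contig] += share
    return dict(bases)


def single_placement_counts(reads, rng):
    """Place each read on exactly one of its candidate contigs at random."""
    counts = defaultdict(int)
    for _read_len, contigs in reads:
        counts[rng.choice(contigs)] += 1
    return dict(counts)


# Hypothetical input: one 150 bp read that maps equally well to two contigs.
reads = [(150, ["contigA", "contigB"])]

coverage = aemb_style_coverage(reads)
placed = single_placement_counts(reads, random.Random(0))

print(coverage)  # {'contigA': 75.0, 'contigB': 75.0}
print(placed)    # exactly one of the two contigs, with count 1
```

Both contigs receive a non-zero share under the aemb-style counting, while the single-placement model leaves one of them with no reads at all. With only a few ambiguous reads per contig, that outcome is not unlikely.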

Another possibility is that, since extension alignment should be slightly more accurate, extension mode finds the single true best location (which is not on the contig), while aemb considers several locations equally good, with the contig being one of them.

Edit: I saw you updated the initial question. Yes, you are correct that this is a possibility.

ksahlin commented 2 months ago

You can set the -N parameter to a positive value if you want strobealign to report more locations per read in default mode.