ocxtal / minialign

[IMPORTANT: not for real data analysis, only for algorithm evaluation] fast and accurate alignment tool for PacBio and Nanopore long reads
MIT License

segfault on certain combination of gapopen and gapextension penalties #7

Open svm-zhang opened 7 years ago

svm-zhang commented 7 years ago

Hello Hajime,

First of all, thanks for developing minialign!

I was trying to map some simulated PacBio reads (from PBSIM) against both the GRCh38 and hg19 references. Besides the default scoring scheme, I wanted to explore the effects of other settings (such as the defaults used by other long-read mappers, e.g., NGM-LR, GraphMap, etc.).

I encountered this segfault with each of the following combinations of -p and -q:

-p5 -q3
-p4 -q4
-p5 -q4

With -p4 -q2, -p5 -q2, and -p4 -q3, minialign worked just like magic. Some of these settings might not make any practical sense, but I figured you would want to know about this problem (that's why I tried different combinations). In htop, the runs that segfaulted went straight to "sleep" status after building/loading the index.

Hope this makes sense and is somewhat helpful to you.

Simo

ocxtal commented 7 years ago

Hi, Simo

Thanks a lot for your kind issue report. I was able to reproduce the problem in my environment, and it seems to be related to an issue I already know about: https://github.com/ocxtal/minialign/issues/2 , which is caused by a variable overflow in the dynamic programming (Smith-Waterman) routine.

Unfortunately, I have not yet figured out how to fix the problem, since the ad-hoc solutions I tried are confirmed to seriously degrade performance (computational speed). So, as a temporary fix, I've added a notification about the potential SEGV for such sets of parameters.
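
For illustration only, here is a minimal Python sketch of how a variable overflow can arise when dynamic programming scores are kept in narrow integer cells. It is not minialign's code: the signed 8-bit cell width, the affine gap convention (open + length * extend), and the helper names `to_int8`, `affine_gap_cost`, and `first_overflowing_gap` are all assumptions made for the example, and it does not reproduce the exact parameter threshold reported above.

```python
def to_int8(x):
    """Emulate storing x in a signed 8-bit cell (two's-complement wraparound)."""
    return (x + 128) % 256 - 128


def affine_gap_cost(length, gap_open, gap_extend):
    """Affine gap cost, assuming the convention: open + length * extend."""
    return gap_open + length * gap_extend


def first_overflowing_gap(gap_open, gap_extend, max_len=1000):
    """Return the smallest gap length whose (negative) score no longer
    fits in the assumed 8-bit cell, or None if none is found up to max_len."""
    for length in range(1, max_len + 1):
        exact = -affine_gap_cost(length, gap_open, gap_extend)
        if to_int8(exact) != exact:  # stored value differs from the true score
            return length
    return None


print(first_overflowing_gap(4, 2))  # 63
print(first_overflowing_gap(5, 4))  # 31
```

Under these assumptions, larger gap penalties exhaust the range of a narrow cell sooner, which is the general reason a scoring scheme can push a fixed-width variable past its limit. Widening the cells avoids the wraparound but reduces how many cells fit in one SIMD register, which is one plausible reason such fixes cost speed.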

Thank you very much and best regards,

Hajime Suzuki

svm-zhang commented 7 years ago

Hello Hajime,

Glad to hear this is a known problem and glad to help.

For now, I will stick to the default scheme and the -a1 -b1 -p1 -q1 scheme. Please let me know if you come up with a solution :)

Simo