thegenemyers / DALIGNER

Find all significant local alignments between reads

LAsort: segfault in old code #93

Closed a-ludi closed 4 years ago

a-ludi commented 4 years ago

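I get another segfault in LAsort. This time it is in line 326 of LAsort.c (https://github.com/thegenemyers/DALIGNER/blame/477d5b92459c7e22baf2e15af712b70ee54c838b/LAsort.c#L326). I attached the file that causes the error: reads-simulated-pb.9.assembly-reference.tar.gz (https://github.com/thegenemyers/DALIGNER/files/4676388/reads-simulated-pb.9.assembly-reference.tar.gz)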

thegenemyers commented 4 years ago

I am unable to reproduce a bug on my laptop with the data set provided. Could you compile with -g, run lldb or gdb and send me a trace? -- Gene

a-ludi commented 4 years ago

The stack has just a single frame:

#0  0x0000000000401f5f in main (argc=2, argv=0x7fffffffc738) at LAsort.c:326

I checked that line. The pointer ((Overlap *) (iblock+off)) is invalid because off == -801484786. The base address iblock is accessible, but its contents look like rubbish:

{path = {trace = 0x9dd000001c2, tlen = 0, diffs = 804635, abpos = 22409, bbpos = 825552, aepos = 20, bepos = 59931}, flags = 9498, aread = 0, bread = 1511021325}

Just in case it might help, here's a dump of all local variables:

off = -801484786
j = 8571
perm = 0x6156f0
foutput = 0x6154b0
parse = 0x615030
input = 0x615170
novl = 10256
sov = 8278
iblock = 0x7fffbb9e4018 {path = {trace = 0x9dd000001c2, tlen = 0, diffs = 804635, abpos = 22409, bbpos = 825552, aepos = 20, bepos = 59931}, flags = 9498, aread = 0, bread = 1511021325}
fblock = 0x7fffbbd69010 {path = {trace = 0x0, tlen = 0, diffs = 0, abpos = 0, bbpos = 0, aepos = 0, bepos = 0}, flags = 0, aread = 0, bread = 0}
iend = 0x7fffbbd68004 {path = {trace = 0x5b0b5d0c5d075c10, tlen = 0, diffs = 0, abpos = 0, bbpos = 0, aepos = 0, bepos = 0}, flags = 0, aread = 0, bread = 0}
isize = 3686400
osize = 1000000000
ovlsize = 40
ptrsize = 8
tspace = 100
tbytes = 1
i = 1
VERBOSE = 0
MAP_ORDER = 0

thegenemyers commented 4 years ago

So the only thing I can think of is that you generated the .las on a big-endian machine and are now trying to sort it on a little-endian machine, or vice versa. Basically, the loop simply goes through the file, advancing off one overlap record at a time. The trace vector is variable in length, so the loop looks up its length to see how far to advance off. That lookup is returning garbage. It has got to be an endian problem; please advise. -- Gene
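The walk described above can be sketched with a bounds check that reports the first record that cannot be stepped over safely. This is a minimal illustration, not the code in LAsort.c: the function name first_bad_record is hypothetical, the 40-byte header size is taken from ovlsize = 40 in the dump, and it assumes that tlen is the first 4-byte field of each on-disk record in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OVL_SIZE 40   /* assumed on-disk header size; matches ovlsize = 40 in the dump */

/* Hypothetical helper: walk novl variable-length records in buf[0..len).
   Returns the index of the first record that cannot be walked safely,
   or novl if every record fits inside the buffer. tbytes must be > 0. */
size_t first_bad_record(const unsigned char *buf, size_t len,
                        size_t novl, int tbytes)
{
  size_t off = 0;
  size_t j;

  for (j = 0; j < novl; j++)
    { int32_t tlen;

      if (len - off < OVL_SIZE)                 /* header would run past the buffer */
        return j;
      memcpy(&tlen, buf + off, sizeof(tlen));   /* assumption: tlen is the first field */
      if (tlen < 0)                             /* a negative length would move off backwards */
        return j;
      if ((size_t) tlen > (len - off - OVL_SIZE) / (size_t) tbytes)
        return j;                               /* trace would run past the buffer */
      off += OVL_SIZE + (size_t) tlen * (size_t) tbytes;
    }
  return novl;
}
```

Running a check like this over the block before building the permutation would turn a segfault into a report of which record index is damaged.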

a-ludi commented 4 years ago

It looks like the LAS file is garbage. Everything works just fine until j = 8570. Then the overlap is corrupted, and off becomes negative because tlen is a large negative number:

print *((Overlap *) (iblock+off))
$12 = {path = {trace = 0x0, tlen = -805175296, diffs = 15258, abpos = 0, bbpos = 0, aepos = 0, bepos = 0}, flags = 0, aread = 0, bread = 0}

Unfortunately, I no longer know the exact command I used to create the files. It must have been damapper, likely with -C -e0.841500 -T8 -M24. So, my conclusion for now is that it's a rare error of damapper.
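One way to test the endianness hypothesis raised earlier in the thread against a suspicious field is to check whether the value becomes plausible once its bytes are reversed. The sketch below is only an illustration: swap32, plausible_tlen, and classify_tlen are hypothetical names, and max_tlen is a bound the caller has to choose. For the tlen printed above (-805175296, i.e. 0xD0020000), the swapped reading is 720. That alone proves nothing, since the records before j = 8570 were read without trouble in native byte order.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t x)
{
  return  (x >> 24)
       | ((x >>  8) & 0x0000ff00u)
       | ((x <<  8) & 0x00ff0000u)
       |  (x << 24);
}

/* Hypothetical plausibility test: a trace length should be non-negative
   and no larger than max_tlen, a bound picked by the caller. */
int plausible_tlen(int32_t tlen, int32_t max_tlen)
{
  return tlen >= 0 && tlen <= max_tlen;
}

/* Returns 0 if the value is plausible as read, 1 if it is plausible only
   after a byte swap, and -1 if neither reading is plausible. */
int classify_tlen(int32_t raw, int32_t max_tlen)
{
  int32_t swapped = (int32_t) swap32((uint32_t) raw);

  if (plausible_tlen(raw, max_tlen))
    return 0;
  if (plausible_tlen(swapped, max_tlen))
    return 1;
  return -1;
}
```

A result of 1 on a single field is weak evidence; a result of 1 on every record in the file would point much more strongly at a byte-order mismatch.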

a-ludi commented 4 years ago

PS: I wonder, though, why you could not reproduce the bug if it's the file...

thegenemyers commented 4 years ago

Yes, I don't understand this either. LAsort runs fine and produces what looks like a good result, although I cannot check further without the sequence DB. -- Gene
