yangao07 / abPOA

abPOA: an SIMD-based C library for fast partial order alignment using adaptive band
MIT License

Unstable behavior in local alignment mode on x64 linux, possibly due to uninitialized values #50

Closed: ctsa closed this issue 11 months ago

ctsa commented 11 months ago

Hi @yangao07 ,

I'm using the latest binary release of abPOA, 1.4.1, on x64 Linux (CentOS 7). I was getting unstable, non-deterministic results for my use case, most often the error [simd_abpoa_align_sequence_to_subgraph] Error in cg_backtrack. I traced this to an uninitialized-memory error that seems to occur in local but not global alignment mode.

I reduced this down to a simple test case, using the following input:

test.fa:

>1
A
>2
A

And running:

valgrind ./abpoa -m1 test.fa

The top of the valgrind results show:

$ valgrind ./abpoa -m1 test.fa
==121699== Memcheck, a memory error detector
==121699== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==121699== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info   
==121699== Command: ./abpoa -m1 test.fa
==121699==
[main] CMD:  ./abpoa -m1 test.fa
==121699== Conditional jump or move depends on uninitialised value(s)
==121699==    at 0x13EC10: simd_abpoa_align_sequence_to_subgraph (in /home/csaunders/devel/github/abPOA/tmp/abPOA-v1.4.1_x64-linux/abpoa)
==121699==    by 0x141AB7: simd_abpoa_align_sequence_to_graph (in /home/csaunders/devel/github/abPOA/tmp/abPOA-v1.4.1_x64-linux/abpoa)
==121699==    by 0x10CE31: abpoa_poa (in /home/csaunders/devel/github/abPOA/tmp/abPOA-v1.4.1_x64-linux/abpoa)
==121699==    by 0x10DD4A: abpoa_msa1 (in /home/csaunders/devel/github/abPOA/tmp/abPOA-v1.4.1_x64-linux/abpoa)
==121699==    by 0x10A694: abpoa_main (in /home/csaunders/devel/github/abPOA/tmp/abPOA-v1.4.1_x64-linux/abpoa)
==121699==    by 0x10A008: main (in /home/csaunders/devel/github/abPOA/tmp/abPOA-v1.4.1_x64-linux/abpoa)

...followed by many similar memory errors that I assume are related. No issues appear for the same test in global mode.
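For readers unfamiliar with this Memcheck message, here is a minimal sketch of the general class of bug it points to. This is not abPOA's code: alloc_score_row and row_max_local are hypothetical names made up for illustration, and the assumption is only that a score buffer obtained from malloc() is read in a branch before every cell has been written. Setting every cell explicitly at allocation time is one common way to avoid that.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, NOT part of abPOA: allocate one row of alignment
 * scores. malloc() leaves the cells indeterminate, so every cell is set
 * explicitly here. Skipping the loop and later branching on an unwritten
 * cell is the pattern Memcheck reports as "Conditional jump or move
 * depends on uninitialised value(s)". */
int32_t *alloc_score_row(size_t n, int32_t init_score) {
    int32_t *row = malloc(n * sizeof *row);
    if (row == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; ++i) {
        row[i] = init_score;
    }
    return row;
}

/* Hypothetical helper, NOT part of abPOA: best score in a row, with a
 * floor of 0 as in local alignment. The comparison below is the kind of
 * conditional that becomes non-deterministic if row[i] was never written. */
int32_t row_max_local(const int32_t *row, size_t n) {
    int32_t best = 0;
    for (size_t i = 0; i < n; ++i) {
        if (row[i] > best) {
            best = row[i];
        }
    }
    return best;
}
```

With the initialization loop in place, a caller that allocates a row and immediately takes its maximum reads only defined values, so the result is the same on every run. The caller is responsible for calling free() on the returned row.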

Note also that the same error occurs for code I compiled from the head of the main branch (d2e0186963b5d8418dcc86f4154a4703fbce94dd).

Thanks for any help or suggestions you might have to stabilize this case!

yangao07 commented 11 months ago

Thanks for pointing this out! It was indeed a bug related to initialization. It is fixed now; please try the latest commit, 6f6a06cd0.

ctsa commented 11 months ago

Nice, thank you! I'll give it a try.

ctsa commented 11 months ago

Okay, all of my test cases are running cleanly through valgrind now. I'll close this as fixed.

Thanks again for the quick fix!