smarco / WFA2-lib

WFA-lib: Wavefront alignment algorithm library v2

wfa2 crashes on ends-free alignment with the following options and the three sequences aligned one after another #86

Closed iAvicenna closed 7 months ago

iAvicenna commented 1 year ago

Hello,

I have been using wfa2 as part of a script that analyses NGS data. I noticed that when I tweaked the parameters in a certain (but not unreasonable) way, it crashes once in a while. I isolated the problem to three sequences which, when aligned one after the other, cause the program to crash. I stripped the whole thing down to a bare-minimum program that still crashes, given below:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include "../ext/WFA2/wavefront/wfa.h"

int main(){

  int i;

  char *ref = "GCTCAGGATGGGAAAAGCTATGCTTGCAAAAGGGGATCTGTTAACAGTTTCTTTAGTAGATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGGAGAGTC";
  char *seqs[3] = 
  {
  "CTACGTAGAGCTCAGGATGGCGCTCAGGAAAAGCTATGCTTGCCTTAGGGGATCTGTTTTGAGTTTCTTTAGTAGATTGAATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGAGAGTCGCTACGTAGAGATCGGAAGAG",
  "TCTTCCGACTACAGGCTCAGGATGGGAATGGGAAAAGCTATGCTTGCAAAAGGGGATCTGTTAGTAGTTTCTTTAGTAGATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGGAGAGTCCTGTAGTCGAGATCGGAAGA",
  "ACGAGAGATACGCTCAGGATGGGATCAGGATGGGAAAAGCTATGCTTGCAAAAGGGGATCTGTTAACAGTTTCTTTAGTAGATTGAATTGGTTGCACAAATTAGAATACAAATATCCAGCGCTGAACGTGACTATGCCAAACAATGGCAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGAGCACGGACAGTGACCAAACCAGCATATATGTTCGAGCATCAGGGAGATCGCTACGTAGAGATCGGAAG"
  };

  /* Configure a two-piece gap-affine aligner */
  wavefront_aligner_t* wf_aligner;
  wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
  attributes.distance_metric = gap_affine_2p;
  attributes.affine2p_penalties.mismatch = 1;
  attributes.affine2p_penalties.match = -1;
  attributes.affine2p_penalties.gap_opening1 = 4;
  attributes.affine2p_penalties.gap_extension1 = 4;
  attributes.affine2p_penalties.gap_opening2 = 4;
  attributes.affine2p_penalties.gap_extension2 = 4;

  /* Ends-free alignment: 12 free bases at each end of the read (pattern),
     6 free bases at each end of the reference (text) */
  attributes.alignment_form.span = alignment_endsfree;
  wf_aligner = wavefront_aligner_new(&attributes);
  wavefront_aligner_set_alignment_free_ends(wf_aligner, 12, 12, 6, 6);

  /* Reuse the same aligner for all three reads; the crash occurs in this loop */
  for (i=0; i<3; i++){

      wavefront_align(wf_aligner, seqs[i], strlen(seqs[i]), ref, strlen(ref));

  }

  wavefront_aligner_delete(wf_aligner);

  return 0;

}

I ran this under valgrind. The part of the log that pertains to the error is shown below (the full log also contains a lot of leak output, because wf_aligner was never freed after the crash):

==31428== Memcheck, a memory error detector
==31428== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==31428== Using Valgrind-3.21.0.GIT and LibVEX; rerun with -h for copyright info
==31428== Command: /home/avicenna/Dropbox/local_packages/merlign/src//ign_debug
==31428== 
==31428== Invalid read of size 8
==31428==    at 0x127214: wavefront_extend_matches_packed_kernel (wavefront_extend.c:182)
==31428==    by 0x127214: wavefront_extend_matches_packed_endsfree (wavefront_extend.c:253)
==31428==    by 0x1277AA: wavefront_extend_endsfree (wavefront_extend.c:435)
==31428==    by 0x12BE70: wavefront_unialign (wavefront_unialign.c:424)
==31428==    by 0x11C6A8: wavefront_align_unidirectional (wavefront_align.c:123)
==31428==    by 0x11C6A8: wavefront_align (wavefront_align.c:170)
==31428==    by 0x11C27C: main (ign_debug.c:36)
==31428==  Address 0xffffffffc56f635b is not stack'd, malloc'd or (recently) free'd
==31428== 
==31428== 
==31428== Process terminating with default action of signal 11 (SIGSEGV)
==31428==  Access not within mapped region at address 0xFFFFFFFFC56F635B
==31428==    at 0x127214: wavefront_extend_matches_packed_kernel (wavefront_extend.c:182)
==31428==    by 0x127214: wavefront_extend_matches_packed_endsfree (wavefront_extend.c:253)
==31428==    by 0x1277AA: wavefront_extend_endsfree (wavefront_extend.c:435)
==31428==    by 0x12BE70: wavefront_unialign (wavefront_unialign.c:424)
==31428==    by 0x11C6A8: wavefront_align_unidirectional (wavefront_align.c:123)
==31428==    by 0x11C6A8: wavefront_align (wavefront_align.c:170)
==31428==    by 0x11C27C: main (ign_debug.c:36)
==31428==  If you believe this happened as a result of a stack
==31428==  overflow in your program's main thread (unlikely but
==31428==  possible), you can try to increase the size of the
==31428==  main thread stack using the --main-stacksize= flag.
==31428==  The main thread stack size used in this run was 8388608.
==31428== 
==31428== HEAP SUMMARY:
==31428==     in use at exit: 8,645,464 bytes in 20 blocks
==31428==   total heap usage: 28 allocs, 8 frees, 8,671,936 bytes allocated

I also ran this with gdb to see exactly where it crashes, and it reports the fault here:

Program received signal SIGSEGV, Segmentation fault.
0x0000555555573214 in wavefront_extend_matches_packed_kernel (offset=-1073741823, k=0, wf_aligner=0x7ffff761d020) at wavefront_extend.c:253
253         offset = wavefront_extend_matches_packed_kernel(wf_aligner,k,offset);

Printing the offset value produces this:

$1 = -1073741823
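
As a side note on that value: -1073741823 is exactly INT32_MIN/2 + 1, which looks more like a sentinel that was incremented once than a real position in the sequence. The sketch below only checks that arithmetic. The `SKETCH_OFFSET_NULL` macro and the `sketch_*` helpers are hypothetical names, and the idea that the library marks unused offsets with a value of INT32_MIN/2 is an assumption, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel for this sketch: half of INT32_MIN.
 * Assumption: unused wavefront offsets are marked with a value like this. */
#define SKETCH_OFFSET_NULL (INT32_MIN / 2)

/* Compile-time check of the arithmetic: INT32_MIN / 2 == -1073741824. */
static_assert(SKETCH_OFFSET_NULL == -1073741824, "unexpected sentinel value");

/* How far an observed offset is from the assumed sentinel.
 * Computed in 64 bits so the subtraction cannot overflow. */
static int64_t sketch_distance_from_null(int32_t offset) {
  return (int64_t)offset - (int64_t)SKETCH_OFFSET_NULL;
}

/* An offset can only index the text if it lies within [0, text_length]. */
static bool sketch_offset_is_plausible(int32_t offset, int32_t text_length) {
  return offset >= 0 && offset <= text_length;
}
```

With the value printed by gdb, `sketch_distance_from_null(-1073741823)` evaluates to 1 and `sketch_offset_is_plausible(-1073741823, n)` is false for any text length `n`, which would be consistent with the kernel reading from an address far outside the sequence buffer.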

Note that I have run valgrind on other inputs (for instance, only the first two sequences above) and it produces a clean log in those cases. For valgrind I had to compile wfa2 with MARCH_FLAG="" BUILD_WFA_PARALLEL="0"; otherwise valgrind produces some warnings specific to those build options.

Further info:

Let me know if there are any other diagnostics I can produce. This is as far as I could go with gdb and valgrind, since I don't know the internal details of wfa2. Is there a function that resets the internal state of a wavefront_aligner_t object to its default state without deleting it? That might serve as a temporary workaround until a patch is available.

Thanks

smarco commented 1 year ago

Hi,

I am busy these days, but I will fix this bug as soon as possible.

Cheers,

smarco commented 8 months ago

Hi,

I apologize for the terrible delay in answering this issue. I am sorry, but using the latest version (dev), I couldn't reproduce the memory problem. I tried with the same compiler and OS.

What is the specific CPU ISA you are using (x86, ARM, ...)? Let us try to reproduce this problem and fix it.

smarco commented 8 months ago

Maybe this is related to issue #87? Have a look at the latest commit to development (commit https://github.com/smarco/WFA2-lib/commit/1d970397af1496a6dba722cfa705b18e2fcf4bd0).