smarco / WFA2-lib

WFA-lib: Wavefront alignment algorithm library v2

Unexpected alignment failures #21

Closed h-2 closed 2 years ago

h-2 commented 2 years ago

I have integrated WFA2-lib into an app, and I keep getting alignment failures that are difficult to track down. After some successful alignments, wavefront_align() will just return -1. The exact point where this happens depends on whether I build in debug or release mode and whether I turn on sanitizers. Release mode produces many hundreds of correct alignments, debug mode fails after a couple, and with ASAN it fails immediately.

I suspect there is undefined behaviour somewhere in WFA2-lib.

Here is a minimal example:

extern "C" {
#include <wavefront/wavefront_align.h>
}

#include <cassert>      // assert
#include <cstdio>       // fprintf, stderr
#include <string_view>  // std::string_view

void do_align(std::string_view const pattern, std::string_view const ref)
{
  // Configure alignment attributes
  wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
  attributes.alignment_scope = compute_alignment;
  attributes.distance_metric = gap_affine;
  attributes.affine_penalties.mismatch = 3;
  attributes.affine_penalties.gap_opening = 5;
  attributes.affine_penalties.gap_extension = 1;
  attributes.alignment_form.span = alignment_endsfree;
  attributes.alignment_form.pattern_begin_free = 0;
  attributes.alignment_form.pattern_end_free = 0;
  attributes.alignment_form.text_begin_free = 1;
  attributes.alignment_form.text_end_free = 1;
  attributes.heuristic.strategy = wf_heuristic_wfadaptive;
  attributes.heuristic.min_wavefront_length = 10;
  attributes.heuristic.max_distance_threshold = 10;
  attributes.heuristic.steps_between_cutoffs = 1;

  wavefront_aligner_t * const wf_aligner = wavefront_aligner_new(&attributes);

  int res = wavefront_align(wf_aligner, pattern.data(), pattern.size(), ref.data(), ref.size());

  cigar_print_pretty(stderr, pattern.data(), pattern.size(), ref.data(), ref.size(),
                     &wf_aligner->cigar, wf_aligner->mm_allocator);
  fprintf(stderr,"Alignment Score %d\nResult:%d\n", wf_aligner->cigar.score, res);

  assert(res != -1);
  wavefront_aligner_delete(wf_aligner);
}

int main()
{
    do_align("GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTCAGGAGCCGAGCCGACGAGGTGGTGATGTTGGTCGGGCGTGATCCGGGGTGGCGTGACGAGGATGGCGGGGTGGTAGCGGGGGGGGGGGGGGGCGGGCGGGCGGGGGGGGGGGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG",
             "GGGGGGGGGGGGATTATACAGCAAATTTACTTAAAAATGTGTATTAGTCAGATTTTTAGTTACTCATGGGTAAATGCAATCCCTAATTAAGGGTGTGAAGTGAGTGCTGAAACTTGCTTAGGAAAAGAGGTGGAAAAATTGGATGGGAATTAAGCATAGAGGTACCACGAAGTATCTGAAATTGTTTGGTTATGTCTGTAGACAAATCAAATGCTTAAACAAAATAAACTGAAATTTTCAACACATGCACACACACAGTCCTCATACTTTTAGATTTTTAGTTTAAAAAATAAGT");
}

When building with these options: g++ wfabug.cpp -I ~/devel/WFA2-lib ~/devel/WFA2-lib/lib/libwfa.a it prints:

      ALIGNMENT 12M1X25I1M1X1M48D26I3M20I3M1X1M1X1M2X1M1X1M1X1M1X1M33D1M1X1M1X1M4I2M2X2M3X1M1X7M2X1M2X4M2X3M2X1M1X1M2X1M3X1M1X2M4X4M1X1M8I2M1X1M3X3M2X1M16D1M1X1M1X1M32I1M37D41I1M43D20I1M1I
      ALIGNMENT.COMPACT 1X25I1X48D26I20I1X1X2X1X1X1X33D1X1X4I2X3X1X2X2X2X2X1X2X3X1X4X1X8I1X3X2X16D1X1X32I37D41I43D20I1I
      PATTERN    GGGGGGGGGGGGG-------------------------GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG--------------------------GGG--------------------GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTCAGGA----GCCGAGCCGACGAGGTGGTGATGTTGGTCGGGCGTGATCCGGGGTGGCGTGACGAGG--------ATGGCGGGGTGGTAGCGGGGGGGGGGGGGGGCGG--------------------------------GCGGGCGGGGGGGGGGGGCGGGGGGGGGGGGGGGGGGG-----------------------------------------GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG--------------------G-
                 ||||||||||||                          | |                                                                          |||                    ||| | |  | | | |                                 | | |    ||  ||   | |||||||  |  ||||  |||  | |  |   | ||    |||| |        || |   |||  |                | | |                                |                                                                              |                                                               | 
      TEXT       GGGGGGGGGGGGATTATACAGCAAATTTACTTAAAAATGTG------------------------------------------------TATTAGTCAGATTTTTAGTTACTCATGGGTAAATGCAATCCCTAATTAAGGGTGTGAAGTGAGTG---------------------------------CTGAAACTTGCTTAGGAAAAGAGGTGGAAAAATTGGATGGGAATTAAGCATAGAGGTACCACGAAGTATCTGAAATTGTTTGGTTAT----------------GTCTGTAGACAAATCAAATGCTTAAACAAAATAAACTG-------------------------------------AAATTTTCAACACATGCACACACACAGTCCTCATACTTTTAG-------------------------------------------ATTTTTAGTTTAAAAAATAAGT
Alignment Score -553
Result:0
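
As a quick sanity check on output like the above, one can count how many pattern and text characters a CIGAR string covers. The sketch below is a hypothetical helper, not part of WFA2-lib: the names `CigarSpan` and `cigar_span` are made up for illustration. It assumes the convention visible in the pretty-printed rows above, where `M` and `X` advance both sequences, `I` puts a gap in the PATTERN row (advancing only the text), and `D` puts a gap in the TEXT row (advancing only the pattern).

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of WFA2-lib): lengths covered by a run-length CIGAR.
struct CigarSpan {
  std::size_t pattern_len = 0;  // pattern characters covered (M, X, D)
  std::size_t text_len = 0;     // text characters covered (M, X, I)
};

// Parses strings such as "12M1X25I". Returns std::nullopt on malformed input.
std::optional<CigarSpan> cigar_span(std::string_view cigar)
{
  CigarSpan span;
  std::size_t count = 0;
  bool have_count = false;
  for (char const c : cigar) {
    if (std::isdigit(static_cast<unsigned char>(c))) {
      count = count * 10 + static_cast<std::size_t>(c - '0');
      have_count = true;
      continue;
    }
    if (!have_count) return std::nullopt;  // operation letter without a length
    switch (c) {
      case 'M':
      case 'X':
        span.pattern_len += count;
        span.text_len += count;
        break;
      case 'I':  // gap in the pattern row: only the text advances
        span.text_len += count;
        break;
      case 'D':  // gap in the text row: only the pattern advances
        span.pattern_len += count;
        break;
      default:
        return std::nullopt;  // unknown operation
    }
    count = 0;
    have_count = false;
  }
  if (have_count) return std::nullopt;  // trailing digits with no operation
  return span;
}
```

For the prefix `12M1X25I` of the CIGAR above, this yields 13 pattern characters and 38 text characters, which agrees with the 13 pattern bases followed by 25 gap columns at the start of the PATTERN row. Comparing the totals for a full CIGAR against `pattern.size()` and `ref.size()` is one way to spot a truncated or empty alignment.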

When building with the address sanitizer: g++ -fsanitize=address wfabug.cpp -I ~/devel/WFA2-lib ~/devel/WFA2-lib/lib/libwfa.a it prints:

      ALIGNMENT
      ALIGNMENT.COMPACT
      PATTERN    GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTCAGGAGCCGAGCCGACGAGGTGGTGATGTTGGTCGGGCGTGATCCGGGGTGGCGTGACGAGGATGGCGGGGTGGTAGCGGGGGGGGGGGGGGGCGGGCGGGCGGGGGGGGGGGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
                 ???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
      TEXT       GGGGGGGGGGGGATTATACAGCAAATTTACTTAAAAATGTGTATTAGTCAGATTTTTAGTTACTCATGGGTAAATGCAATCCCTAATTAAGGGTGTGAAGTGAGTGCTGAAACTTGCTTAGGAAAAGAGGTGGAAAAATTGGATGGGAATTAAGCATAGAGGTACCACGAAGTATCTGAAATTGTTTGGTTATGTCTGTAGACAAATCAAATGCTTAAACAAAATAAACTGAAATTTTCAACACATGCACACACACAGTCCTCATACTTTTAGATTTTTAGTTTAAAAAATAAGT
Alignment Score -2147483648
Result:-1
a.out: wfabug.cpp:36: void do_align(std::string_view, std::string_view): Assertion `res != -1' failed.
[1]    673339 IOT instruction (core dumped)  ./a.out
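
The score printed in the failing run, -2147483648, equals the smallest value of a 32-bit `int`, which suggests the score field was never filled in rather than computed (an assumption; the thread does not confirm it). Since the example prints the score before checking the return value, a small guard along the following lines may make such output easier to interpret. `describe_alignment` is a hypothetical name, not a WFA2-lib function.

```cpp
#include <limits>
#include <string>

// Hypothetical helper (not part of WFA2-lib): summarize a status/score pair.
std::string describe_alignment(int status, int score)
{
  // -1 is the failure value reported in this thread.
  if (status == -1) return "alignment failed";
  // Assumption: the minimum int is a placeholder, not a real score.
  if (score == std::numeric_limits<int>::min()) return "no score available";
  return "score " + std::to_string(score);
}
```

With the values from the two runs above, `describe_alignment(0, -553)` returns `"score -553"`, while `describe_alignment(-1, std::numeric_limits<int>::min())` returns `"alignment failed"`.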

smarco commented 2 years ago

Hi,

I have tried your example with the current master and it seems to work fine (in both compilation cases). Can you try again and let me know? (I'm sure we can solve this one real quick).

h-2 commented 2 years ago

I can still reproduce my problem with the main branch (commit d933a102c4eb4cc4d01c2a116a13f9969c531873).

I have tried with gcc-9 and gcc-11. I haven't been able to get this to build with clang, but I only tried for a couple of minutes.

Is this maybe related to the problems people have been reporting in #16?

smarco commented 2 years ago

I tried with a clean VM (Ubuntu 20) and g++ 9.0, and I still couldn't reproduce the problem. Would you mind sharing more details of your environment/setup so I can reproduce it?

h-2 commented 2 years ago

Sorry, I must have done something very weird with my local git branch. After getting a fresh clone, it now seems to work.

Thanks a lot and sorry for causing extra work!

smarco commented 2 years ago

No worries. I must admit that I feel relieved that there is no weird bug anymore :-) Thanks