smarco / WFA2-lib

WFA-lib: Wavefront alignment algorithm library v2
Other
162 stars 36 forks source link

Uninitialized offset can lead to client segfault #84

Closed ctsa closed 1 year ago

ctsa commented 1 year ago

Hi @smarco -

Thanks again for WFA2 and your prior help with a similar issue -- I have a new problem where I'm getting non-deterministic segfaults in client code. I traced this down to one issue in wfa2 which is likely to be the cause, a Conditional jump or move depends on uninitialised value here:

https://github.com/smarco/WFA2-lib/blob/9fdf46efd95bcac751adae0a57b44c2e80e6eef7/wavefront/wavefront_extend_kernels.c#L143

I reduced the problem case to the following code:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include "wavefront/wavefront_align.h"

int main() {
    char pattern[] = "TAT";
    char text[]    = "CAT";

    wavefront_aligner_attr_t attr = wavefront_aligner_attr_default;
    attr.distance_metric = gap_linear;
    attr.linear_penalties.match = -1;
    attr.linear_penalties.mismatch = 2;
    attr.linear_penalties.indel = 2;
    attr.alignment_scope = compute_alignment;

    attr.alignment_form.span = alignment_endsfree;
    attr.alignment_form.pattern_begin_free = 1;
    attr.alignment_form.pattern_end_free = 1;
    attr.alignment_form.text_begin_free = 1;
    attr.alignment_form.text_end_free = 1;

    wavefront_aligner_t* const wf_aligner = wavefront_aligner_new(&attr);
    wavefront_align(wf_aligner, pattern, strlen(pattern), text, strlen(text));
    cigar_print_pretty(stderr,
      wf_aligner->cigar,pattern,strlen(pattern),text,strlen(text));

    fprintf(stderr,"Alignment Score %d\n",wf_aligner->cigar->score);

    wavefront_aligner_delete(wf_aligner);
}

After building this in linux debug mode against the library version currently on main (9fdf46e), the valgrind output is:

==44021== Memcheck, a memory error detector
==44021== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==44021== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==44021== Command: ./a.out
==44021==
==44021== Conditional jump or move depends on uninitialised value(s)
==44021==    at 0x41242B: wavefront_extend_matches_packed_endsfree (wavefront_extend_kernels.c:143)
==44021==    by 0x40CE7D: wavefront_extend_endsfree_dispatcher_seq (wavefront_extend.c:229)
==44021==    by 0x40CF02: wavefront_extend_endsfree_dispatcher_threads (wavefront_extend.c:246)
==44021==    by 0x40CFC5: wavefront_extend_endsfree (wavefront_extend.c:282)
==44021==    by 0x411999: wavefront_unialign (wavefront_unialign.c:251)
==44021==    by 0x401641: wavefront_align_unidirectional (wavefront_align.c:132)
==44021==    by 0x40192E: wavefront_align (wavefront_align.c:228)
==44021==    by 0x40129E: main (test.c:26)
==44021==
==44021== Use of uninitialised value of size 8
==44021==    at 0x412474: wavefront_extend_matches_kernel_blockwise (wavefront_extend_kernels.c:72)
==44021==    by 0x412474: wavefront_extend_matches_packed_endsfree (wavefront_extend_kernels.c:145)
==44021==    by 0x40CE7D: wavefront_extend_endsfree_dispatcher_seq (wavefront_extend.c:229)
==44021==    by 0x40CF02: wavefront_extend_endsfree_dispatcher_threads (wavefront_extend.c:246)
==44021==    by 0x40CFC5: wavefront_extend_endsfree (wavefront_extend.c:282)
==44021==    by 0x411999: wavefront_unialign (wavefront_unialign.c:251)
==44021==    by 0x401641: wavefront_align_unidirectional (wavefront_align.c:132)
==44021==    by 0x40192E: wavefront_align (wavefront_align.c:228)
==44021==    by 0x40129E: main (test.c:26)
==44021==
==44021== Use of uninitialised value of size 8
==44021==    at 0x41247B: wavefront_extend_matches_kernel_blockwise (wavefront_extend_kernels.c:72)
==44021==    by 0x41247B: wavefront_extend_matches_packed_endsfree (wavefront_extend_kernels.c:145)
==44021==    by 0x40CE7D: wavefront_extend_endsfree_dispatcher_seq (wavefront_extend.c:229)
==44021==    by 0x40CF02: wavefront_extend_endsfree_dispatcher_threads (wavefront_extend.c:246)
==44021==    by 0x40CFC5: wavefront_extend_endsfree (wavefront_extend.c:282)
==44021==    by 0x411999: wavefront_unialign (wavefront_unialign.c:251)
==44021==    by 0x401641: wavefront_align_unidirectional (wavefront_align.c:132)
==44021==    by 0x40192E: wavefront_align (wavefront_align.c:228)
==44021==    by 0x40129E: main (test.c:26)
==44021==
==44021== Conditional jump or move depends on uninitialised value(s)
==44021==    at 0x4120A9: wavefront_termination_endsfree (wavefront_termination.c:128)
==44021==    by 0x412516: wavefront_extend_matches_packed_endsfree (wavefront_extend_kernels.c:148)
==44021==    by 0x40CE7D: wavefront_extend_endsfree_dispatcher_seq (wavefront_extend.c:229)
==44021==    by 0x40CF02: wavefront_extend_endsfree_dispatcher_threads (wavefront_extend.c:246)
==44021==    by 0x40CFC5: wavefront_extend_endsfree (wavefront_extend.c:282)
==44021==    by 0x411999: wavefront_unialign (wavefront_unialign.c:251)
==44021==    by 0x401641: wavefront_align_unidirectional (wavefront_align.c:132)
==44021==    by 0x40192E: wavefront_align (wavefront_align.c:228)
==44021==    by 0x40129E: main (test.c:26)
==44021==
==44021== Conditional jump or move depends on uninitialised value(s)
==44021==    at 0x4120FD: wavefront_termination_endsfree (wavefront_termination.c:144)
==44021==    by 0x412516: wavefront_extend_matches_packed_endsfree (wavefront_extend_kernels.c:148)
==44021==    by 0x40CE7D: wavefront_extend_endsfree_dispatcher_seq (wavefront_extend.c:229)
==44021==    by 0x40CF02: wavefront_extend_endsfree_dispatcher_threads (wavefront_extend.c:246)
==44021==    by 0x40CFC5: wavefront_extend_endsfree (wavefront_extend.c:282)
==44021==    by 0x411999: wavefront_unialign (wavefront_unialign.c:251)
==44021==    by 0x401641: wavefront_align_unidirectional (wavefront_align.c:132)
==44021==    by 0x40192E: wavefront_align (wavefront_align.c:228)
==44021==    by 0x40129E: main (test.c:26)
==44021==
      ALIGNMENT 1X2M
      ETRACE    1X
      CIGAR     1X2M
      PATTERN    TAT
                  ||
      TEXT       CAT
Alignment Score 0
==44021==
==44021== HEAP SUMMARY:
==44021==     in use at exit: 0 bytes in 0 blocks
==44021==   total heap usage: 23 allocs, 23 frees, 4,297,004 bytes allocated
==44021==
==44021== All heap blocks were freed -- no leaks are possible
==44021==
==44021== Use --track-origins=yes to see where uninitialised values come from
==44021== For lists of detected and suppressed errors, rerun with: -s
==44021== ERROR SUMMARY: 15 errors from 5 contexts (suppressed: 0 from 0)

I tried to pursue this a bit further, and could at least figure out that the initialization problem seems to be specific to wavefront 2.

If you're able to reproduce and fix (or explain a workaround for) this case it would be very helpful. Thank you!

smarco commented 1 year ago

Ok, thanks for the report. That was a really interesting one.

TBH, the code allowing match!=0 and ends-free is not ideal. This will improve greatly with the next major release (in speed and code robustness). Until then, I patched it.

Let me know.

ctsa commented 1 year ago

Great, thank you! I confirm the issue appears to be fixed on my case.