smarco / BiWFA-paper

Bidirectional WFA (Paper)

Alignment ranges #5

Open rmhubley opened 2 years ago

rmhubley commented 2 years ago

In the wfalm library there is an API that produces a seeded local alignment (ends-free extension) specified using two anchor points:

    aligner_sw.wavefront_align_local_low_mem(local_seq1.c_str(), local_seq1.size(),
                                             local_seq2.c_str(), local_seq2.size(),
                                             anchor_begin_1, anchor_end_1,
                                             anchor_begin_2, anchor_end_2,
                                             false);

I assume, but haven't verified, that this is roughly equivalent to the following in the BiWFA/WFA2 API:

    // Right extension from anchor ends
    attributes.alignment_form.span = alignment_endsfree;
    attributes.alignment_form.pattern_begin_free = anchor_end_1;
    attributes.alignment_form.pattern_end_free = strlen(seq1);
    attributes.alignment_form.text_begin_free = anchor_end_2;
    attributes.alignment_form.text_end_free = strlen(seq2);

    // Followed by Left extension from anchor starts
    attributes.alignment_form.span = alignment_endsfree;
    attributes.alignment_form.pattern_begin_free = anchor_start_1;
    attributes.alignment_form.pattern_end_free = 0;
    attributes.alignment_form.text_begin_free = anchor_start_2;
    attributes.alignment_form.text_end_free = 0;

The use of begin/end_free is a bit confusing in your left/right extension example as it pertains to this use case. Specification aside, are these roughly equivalent operations between the libraries? Would it be possible to request that in future releases you include the alignment ranges in the output as was done in wfalm? They could be gleaned from the CIGAR, but that is less efficient to process, and in many cases the alignment details may not even be needed, only the range of the extension.
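For reference, recovering the ranges from a CIGAR amounts to one linear pass over the operations. The sketch below is only an illustration of that pass: `aligned_span_t` and `cigar_span` are hypothetical names, not part of WFA2-lib or wfalm, and it assumes an uncompressed operation string in which `M`/`X` consume one character of both sequences, `D` consumes only the pattern, and `I` consumes only the text. That convention should be checked against the library in use.

```c
#include <stddef.h>

/* Hypothetical helper type; not part of WFA2-lib or wfalm. */
typedef struct {
    size_t pattern_len; /* pattern characters consumed by the alignment */
    size_t text_len;    /* text characters consumed by the alignment */
} aligned_span_t;

/* Walk an uncompressed operation string (one character per alignment column).
 * Assumed convention: 'M' and 'X' consume one character of each sequence,
 * 'D' consumes only the pattern, 'I' consumes only the text.
 * Returns 0 on success, -1 if an unknown operation is encountered. */
static int cigar_span(const char *ops, size_t n, aligned_span_t *out) {
    out->pattern_len = 0;
    out->text_len = 0;
    for (size_t i = 0; i < n; ++i) {
        switch (ops[i]) {
            case 'M':
            case 'X':
                ++out->pattern_len;
                ++out->text_len;
                break;
            case 'D':
                ++out->pattern_len;
                break;
            case 'I':
                ++out->text_len;
                break;
            default:
                return -1;
        }
    }
    return 0;
}
```

For a right extension starting at the anchor ends, the extension would then cover `[anchor_end_1, anchor_end_1 + pattern_len)` on the first sequence and `[anchor_end_2, anchor_end_2 + text_len)` on the second. The pass is linear in the alignment length, which is the overhead a ranges field in the output would avoid.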

RagnarGrootKoerkamp commented 2 years ago

I got the impression that the BiWFA code was forked before some features were added to WFA2? I was also missing the --algorithm flag. Would be nice to see them merged again :)

smarco commented 2 years ago

> The use of begin/end_free is a bit confusing in your left/right extension example as it pertains to this use case.

I agree. The functionality you are looking for is of a higher level, combining two alignment extensions (i.e., left and right).

> Specification aside, are these roughly equivalent operations between the libraries?

Note that @jeizenga's wfalm does something a little more sophisticated than just ends-free/extension alignment. He also determines the optimal point at which to stop the extension. @jeizenga, correct me if I'm wrong here.

> Would it be possible to request that in future releases you include the alignment ranges in the output as was done in wfalm?

Sure, if it is useful, I can try to borrow @jeizenga's time to see how to implement the API and method best.

> Would be nice to see them merged again :)

Sure, we have already integrated BiWFA into the development branch of WFA2-lib under the memory mode 'ultralow'. We are still polishing the details, but feel free to use it.

jeizenga commented 2 years ago

Yes, that's correct. I think the technique in wfalm is probably most useful for read mapping applications, where you expect to have nearly full-length alignments, minus some soft-clipping. For a more general extension alignment, it could end up being very costly to guarantee an optimal extension, and I expect an X-drop heuristic would be more appropriate.
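To make the X-drop idea concrete, here is a minimal sketch of its simplest, ungapped form: extend to the right of an anchor, track the best score seen, and stop once the running score falls more than `xdrop` below it. This is not how wfalm or WFA2 implement extension; the function name `xdrop_extend_right` and the scoring parameters are assumptions chosen for illustration, and a gapped variant would apply the same stopping rule across diagonals or wavefronts.

```c
#include <stddef.h>

/* Ungapped X-drop extension to the right of an anchor (illustrative only).
 * Compares a[i0 + k] with b[j0 + k] for k = 0, 1, ... and accumulates
 * +match for equal characters and -mismatch otherwise.
 * Stops when the running score drops more than xdrop below the best score,
 * or when either sequence ends.
 * Returns the length of the best-scoring prefix of the extension and,
 * if best_score_out is non-NULL, stores that prefix's score there. */
static size_t xdrop_extend_right(const char *a, size_t a_len, size_t i0,
                                 const char *b, size_t b_len, size_t j0,
                                 int match, int mismatch, int xdrop,
                                 int *best_score_out) {
    int score = 0;
    int best = 0;
    size_t best_len = 0;
    for (size_t k = 0; i0 + k < a_len && j0 + k < b_len; ++k) {
        score += (a[i0 + k] == b[j0 + k]) ? match : -mismatch;
        if (score > best) {
            best = score;
            best_len = k + 1;
        } else if (best - score > xdrop) {
            break; /* score fell too far below the best: give up extending */
        }
    }
    if (best_score_out != NULL) {
        *best_score_out = best;
    }
    return best_len;
}
```

The returned length directly gives the extension range, `[i0, i0 + len)` and `[j0, j0 + len)`, without building a CIGAR. The trade-off is the one described above: the result is heuristic, so a good alignment lying beyond a low-scoring stretch longer than the drop threshold allows would be missed.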