Closed RagnarGrootKoerkamp closed 1 year ago
Thanks for the feedback and sorry for the late reply. You are referring to the paper "Accelerating Edit-Distance Sequence Alignment on GPU Using the Wavefront Algorithm", which has the code here: https://github.com/quim0/eWFA-GPU
Sorry for the confusing naming, can you open the issue on that repo?
In Table 1, you compare some CPU implementations, but it is not quite clear to me exactly which tools you use.
Edlib
Edlib implements an $O(nd)$ algorithm based on the exponential search in Ukkonen'85. This part is clear :smile:
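For other readers of this thread, the band-doubling (exponential search) idea behind the $O(nd)$ bound can be sketched as follows. This is my own minimal Python, purely illustrative — Edlib itself additionally uses Myers' bit-parallelism inside the band:

```python
INF = float("inf")

def _banded_dp(a: str, b: str, t: int):
    # Score-only edit-distance DP restricted to the band |i - j| <= t;
    # cells outside the band are treated as infinite.
    n, m = len(a), len(b)
    if abs(n - m) > t:
        return None
    prev = [j if j <= t else INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        if i <= t:
            cur[0] = i
        for j in range(max(1, i - t), min(m, i + t) + 1):
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match/substitution
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
            )
        prev = cur
    return prev[m] if prev[m] < INF else None

def edit_distance_banded(a: str, b: str) -> int:
    # Ukkonen'85 exponential search: double the band width t until the
    # computed distance fits inside the band (d <= t), at which point it
    # is exact. Total work is O(n * d).
    t = 1
    while True:
        d = _banded_dp(a, b, t)
        if d is not None and d <= t:
            return d
        t *= 2
```

The key invariant is that any alignment of cost $d \le t$ stays within the band, so once the banded DP returns a value no larger than the current band width, that value is the true distance.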
$O(ND)$
In the background section you mention DAligner, but my limited understanding is that this is a mapper, not a global aligner. In the experimental setup you only link [37], Myers'86 original paper, but that does not come with an implementation itself.
I'm assuming this is some implementation of Myers'86 for edit distance (as opposed to LCS, which is what Myers'86 is originally about). That means it has $O(nd)$ worst case complexity (as the name indicates) and $O(d^2)$ expected complexity (as mentioned in section 4.1 of Myers'86).
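For concreteness, the furthest-reaching-point recurrence adapted to edit distance (diagonals $k = x - y$, with substitutions as an extra diagonal step on top of Myers'86) can be sketched like this — a minimal Python illustration, not any specific tool's code:

```python
def edit_distance_ond(a: str, b: str) -> int:
    # fr[k] = furthest row x reachable on diagonal k = x - y using exactly
    # d edits, after sliding over free matches.
    # O(nd) worst case, O(d^2) expected on random strings.
    n, m = len(a), len(b)

    def slide(x: int, k: int) -> int:
        y = x - k
        while x < n and y < m and a[x] == b[y]:
            x += 1
            y += 1
        return x

    fr = {0: slide(0, 0)}
    if fr[0] >= n and fr[0] >= m:
        return 0
    for d in range(1, n + m + 1):
        new = {}
        for k in range(-d, d + 1):
            cands = []
            p = fr.get(k)          # substitution: diagonal step
            if p is not None and p + 1 <= n and p + 1 - k <= m:
                cands.append(p + 1)
            p = fr.get(k - 1)      # gap in b: step down a row
            if p is not None and p + 1 <= n and p + 1 - k <= m:
                cands.append(p + 1)
            p = fr.get(k + 1)      # gap in a: step right a column
            if p is not None and p <= n and p - k <= m:
                cands.append(p)
            if not cands:
                continue
            x = slide(max(cands), k)
            new[k] = x
            if x >= n and x - k >= m:
                return d
        fr = new
    return n + m
```

Since only the $2d+1$ diagonals at distance $d$ are touched, the expected $O(d^2)$ behavior falls out directly — which is why a runtime independent of the error rate would surprise me here.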
BPM
From the background section, I think this is an abbreviation of Myers bit vector algorithm. You list two such tools, Edlib and BGSA. Does this mean that BPM in the table refers to BGSA?

BGSA seems to be a score-only tool that implements the classic $O(n^2)$ NW algorithm in parallel via SIMD across multiple alignment tasks. This way, it achieves a 30x speedup over Edlib in their paper. I don't see this in your table, so maybe my assumption that you are using BGSA here is wrong?
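For comparison, the score-only recurrence that (I assume) BGSA vectorizes looks like this as minimal scalar Python — one independent instance of this loop per alignment task is what the SIMD lanes would carry:

```python
def nw_edit_distance(a: str, b: str) -> int:
    # Classic score-only Needleman-Wunsch / Levenshtein DP:
    # O(n * m) time, O(min dimension) space, no banding, no early exit,
    # hence no dependency of the runtime on the error rate.
    n, m = len(a), len(b)
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        cur = [i] + [0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match/substitution
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
            )
        prev = cur
    return prev[m]
```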
Table 1 evaluation
In section IV.C, you write:
Edlib has an $O(nd)$ runtime, and in fact, I am quite surprised to not see a dependency of the runtime on the error rate. The dependency may be hidden by the (IO) overhead of dealing with very short sequences. What are your thoughts about this?
Similarly, $O(ND)$ should have an $O(d^2)$ expected runtime, but again, barely any dependency on $e$ is present in the runtime. This again may imply that the measured times are mostly IO overhead. In fact, for lengths 150 and 300, higher error rates give faster runtimes?!
Assuming that BPM is an $O(n^2)$ algorithm, the lack of dependency on $e$ there indeed makes sense.
You continue:
While it is true that Edlib supports banded alignment, at its core it is an exact algorithm. Your experimental setup does not mention the explicit use of banding in Edlib, so assuming you use it with default parameters, it should return an exact result.
Thus, the much (2.5x) slower datapoint at length=1000, e=2% is quite surprising to me. In case you do indeed use some form of banding, could you explain how this could lead to faster runtimes for higher error rates, and why this effect does not happen for shorter sequences? Are you sure this is not some kind of measurement error?