Closed nsmlzl closed 1 year ago
BTW, this PR generates the same zeta array as the original implementation. However, the logic of the original implementation seems slightly off:
The non-quantized zetas (N <= space_max
) are fine. But for the quantized zetas, I would expect to store the smallest value it approximates. Instead, we store the previous zeta.
Here an example for space_max=10
and space_quantization_step=5
(zetas[n]
represents the elements of the zetas vector, while zeta(x)
represents the zeta function):
zetas[1] = zeta(1)
...
zetas[8] = zeta(8)
zetas[9] = zeta(9)
zetas[10] = zeta(10)
zetas[11] = zeta(10) # quantized: used as approximation for zeta(11) to zeta(15)
zetas[12] = zeta(15) # quantized: used as approximation for zeta(16) to zeta(20)
...
So I would expect zetas[11] to store the value of zeta(11), and for zetas[12] to store the value of zeta(16), since those are the corresponding smallest values in the range that the zetas elements approximate.
In practice, however, this will probably not make much of a difference.
Hot!
We preprocess the zeta values in the sort and the layout subcommands:
Previously we repeated this computation for each value of N (1 to x). However, if we examine the zeta function, we can see that we could actually compute all zeta values in a single computation (theta is constant), since the computation of zeta(N) includes the computation of zeta(N-1).
This PR combines the computation of all zetas into a single computation. For N less than or equal to
space_max
, we just store the zeta values directly. For larger N we respectspace_quantization_step
. For my configuration it generated exactly the same values as the previous implementation; you can check this for your configuration as well, just undo the cleanup commits, then you will have the code to compare the generated values.For the layout subcommand, this results in a significant speedup for large pangenomes (e.g., Chr6 from ~230s down to ~40ms). The default configuration of odgi sort seems to generate only a small zeta vector (
space_quantization_step
seems to be huge). With this speedup it might make sense to increase it again.