Like SDSL, simple-sds used to create 1 << log2(m) buckets for the high parts of the values in SparseVector. This often created unnecessary buckets for positions that did not exist in the bitvector. After this PR, there will only be buckets for positions 0 (inclusive) to n (exclusive), which may reduce the size of the structure significantly.
The width of the low parts also used a poor approximation inherited from SDSL: ceil(log2(n + 1)) - ceil(log2(m + 1)). Now the implementation rounds the optimal value log2((n * ln(2)) / m) to the nearest integer.
Like SDSL, simple-sds used to create
1 << log2(m)
buckets for the high parts of the values inSparseVector
. This often created unnecessary buckets for positions that did not exist in the bitvector. After this PR, there will only be buckets for positions0
(inclusive) ton
(exclusive), which may reduce the size of the structure significantly.The width of the low parts also used a poor approximation inherited from SDSL:
ceil(log2(n + 1)) - ceil(log2(m + 1))
. Now the implementation rounds the optimal valuelog2((n * ln(2)) / m)
to the nearest integer.