Closed · amallia closed this issue 5 years ago
They describe a bit vector, whereas we implement their algorithm to generate an array of sorted integers. So when they split A[l...r] into A[l...m] and A[m+1...r], they mean to divide the universe (l...r) into two subuniverses (l...m) and (m+1...r).
It is always possible, of course, that my implementation can be improved. If so, a pull request with an analysis would be welcome.
https://github.com/lemire/FastPFor/blob/71d54a9793245ae90e69c86a425d4ee1ee6543d8/headers/synthetic.h#L102-L128
The above is the synthetic data generator for a clustered series. The reference is "Vo Ngoc Anh and Alistair Moffat. 2010. Index compression using 64-bit words".
The original paper says the following:
In the source code the `cut` (which corresponds to m in the paper) is never used to split the vector, but only to adapt the `min`/`max` of the recursive calls.