HierarchicalBitset is intended for fast iteration and performance critical sections. It can be iterated with the take function, or with iterators. Right now, it works, but it's rather branchy and recursive. It hadn't been benchmarked, but it's likely slower than a linear search in some cases.
The take function clears bits during iteration and is also recursive. The iterators use the next function (also recursive) every single iteration, which is a very slow way to iterate the bits of a single block (usually 64 bit ints).
The "count trailing zeros" intrinsic is used to speed up finding the first set bit in a block. This is practical and fast, and the code can be revised to use it more efficiently.
HierarchicalBitset is slow enough to show up in the profiler. It's probably faster than iterating an std::set, but the goal is to make it faster than iterating a sorted array of integers.
HierarchicalBitset is intended for fast iteration and performance critical sections. It can be iterated with the
take
function, or with iterators. Right now, it works, but it's rather branchy and recursive. It hadn't been benchmarked, but it's likely slower than a linear search in some cases.The
take
function clears bits during iteration and is also recursive. The iterators use thenext
function (also recursive) every single iteration, which is a very slow way to iterate the bits of a single block (usually 64 bit ints).The "count trailing zeros" intrinsic is used to speed up finding the first set bit in a block. This is practical and fast, and the code can be revised to use it more efficiently.
However, there is also SIMD: https://stackoverflow.com/questions/49213611/count-leading-zeros-in-m256i-word https://lemire.me/blog/2018/03/08/iterating-over-set-bits-quickly-simd-edition/
Turning these into complaint iterators does not seem to be a trivial task though.