Closed elshize closed 1 year ago
It seems that recent changes to the pop_heap
implementation in libc++ assume that the entire range [first, last)
is a valid heap, including the last element that we are pushing. I'm trying to figure out if it's a bug, or if the standard in fact requires that.
On another note, the implementation of pop_heap
in both libc++ and libstdc++ use "heapsort with bounce" approach to popping. That is, first a full sift down is performed to find the leaf, and then the heap is fixed by sifting up if necessary (https://en.wikipedia.org/wiki/Heapsort#Bottom-up_heapsort). The idea is that this takes fewer comparisons (see wiki article for details) because the value we sift down-and-up is in the leaf. But in our case this value can be anything that is larger than the top of the heap, so not sure how much we're gaining by doing this hack.
I want to actually do some benchmarks for this.
LLVM commit introducing the change: https://github.com/llvm/llvm-project/commit/79d08e398c17e83b118b837ab0b52107fd294c3e
LLVM discourse thread I started to find out more: https://discourse.llvm.org/t/binary-heap-pop-push-operation-using-pop-heap/67430
Ok, so it turns out that the standard does state that the entire range must be a valid heap: http://eel.is/c++draft/pop.heap#2
The fix itself is trivial (need to do a pop before appending the new value, and then swap the old with new, and call push), but I will (a) look into whether there is a more efficient alternative, and (b) do some benchmarking.
We have some property tests implemented that check
topk_queue
thresholds, and they mysteriously fail on Clang 15.