redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com
9.46k stars 579 forks source link

spill_key_index: Avoid repeated begin() calls #19836

Closed StephanDollberg closed 3 months ago

StephanDollberg commented 3 months ago

The pattern of calling absl::*_hash_map::begin() in a loop is known to be inefficient. Each time it will have to find the first non-empty element in the control array.

This results in a O(N^2) complexity because each time we re-walk the beginning of the array.

Instead keep iterating over the map such that we traverse the control array only once.

In this case the map will never shrink so we can use the capacity as a generation counter to avoid concurrent modifaction and iterator invalidation.

Backports Required

Release Notes

travisdowns commented 3 months ago

@StephanDollberg wrote:

The pattern of calling absl::*_hash_map::begin() in a loop is known to be inefficient. Each time it will have to find the first non-empty element in the control array.

Hash iteration is more expensive than say array-like tree or list-like, but are you saying that begin() is more expensive than say a hash_map increment? My mental model is that they are the same: both involve a "find next element" operation, with a variable cost.

This results in a O(N^2) complexity

~Hmm, in the worst case? Assuming a fixed (or lower-bounded) load factor it feels like O(1) in expectation to me for begin() (so O(N) overall, the expected location of the first element is an arithmetico-geometric sequence so has a fixed value depending only on p (the chance of any individual element being non-empty).~

This assumes random distribution in the hash map, which could be violated by stuff like removing elements "randomly" via iteration which ends up preferring earlier elements.

Added: Right, so we are doing exactly this pattern of removing elements from the start of the hash map which causes this problem.

StephanDollberg commented 3 months ago

@andrwng I am wondering now, do you know off the top of your head (my clangd is failing me) whether there can be concurrent spill_key_index::index calls? I had assumed yes but I see now that spill calls into segment_appender::append which requires there only being one outstanding one so this would feel like a bug already.

andrwng commented 3 months ago

whether there can be concurrent spill_key_index::index calls

I don't believe so, given we append from a single fiber (the Raft append fiber). That said, I agree with Travis that it's a pain to uphold (without introducing some additional locking).

StephanDollberg commented 3 months ago

I don't believe so, given we append from a single fiber (the Raft append fiber).

Right so I think the PR is technically fine but I will close it anyway. It's not worth the effort.

Have raised https://github.com/redpanda-data/core-internal/issues/1316