Open petermattis opened 5 years ago
Makes sense. I wonder if there are additional requirements on SeekGE
and SeekLT
. For example, before I believe the requirement on SeekGE
was the seek key had to be at least the lower bound. Now I expect it'll also need to be at most the upper-bound. Otherwise the binary search could land after the saved upper-bound pointer and forward scan could proceed obliviously. I'd guess adding such a requirement (if we don't already have it) is fine though.
I wonder if there are additional requirements on
SeekGE
andSeekLT
. For example, before I believe the requirement onSeekGE
was the seek key had to be at least the lower bound. Now I expect it'll also need to be at most the upper-bound. Otherwise the binary search could land after the saved upper-bound pointer and forward scan could proceed obliviously. I'd guess adding such a requirement (if we don't already have it) is fine though.
Yes, I think we would need additional checks as you suggest.
@Ryanfsdf This is something to experiment with. It isn't clear is worthwhile or not, so the first step would be to do some quick implementation and write benchmarks to see if there is an improvement. The code in question is in internal/arenaskl
.
The memtable currently checks the lower/upper bound on every iteration step. An alternative is to find the nodes which bound the lower and upper bound (using
SeekGE(lower)
andSeekLT(upper)
)
I think you meant SeekGE(upper)
and SeekLT(lower)
here, unless I'm misunderstanding something?
Makes sense. I wonder if there are additional requirements on
SeekGE
andSeekLT
. For example, before I believe the requirement onSeekGE
was the seek key had to be at least the lower bound. Now I expect it'll also need to be at most the upper-bound. Otherwise the binary search could land after the saved upper-bound pointer and forward scan could proceed obliviously. I'd guess adding such a requirement (if we don't already have it) is fine though.
I don't think we need the extra requirement, since SeekGE
enforces the upper-bound in the code, so the binary search would never land after the saved upper-bound pointer.
I think you meant SeekGE(upper) and SeekLT(lower) here, unless I'm misunderstanding something?
Yep. Thanks for the correction.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to Pebble!
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to Pebble!
The memtable currently checks the lower/upper bound on every iteration step. An alternative is to find the nodes which bound the lower and upper bound (using
SeekGE(lower)
andSeekLT(upper)
) and then do pointer comparisons for the boundary checks. This will make the boundary checks a lot faster, but it does mean that we'll do an extraSeek*
(we can do this lazily, but there will still be at least 1 extra seek). Note that the memtable can be inserted into concurrently with iteration, but those concurrent insertions are always for newer values and cannot affect the keys seen during an iteration.@ajkr I was thinking about this after your comment on https://github.com/petermattis/pebble/pull/81#pullrequestreview-231332657. Thoughts?
Jira issue: PEBBLE-184