Closed GoogleCodeExporter closed 9 years ago
This code needs to implement something other than
std::lower_bound. I suspect there are misleading
comments in the code that are leading you astray.
In particular, the block stores multiple keys
between each restart point. And what the binary
search needs to do is find the largest restart
point with a key < target so we can start
decoding from there and find any occurrence of
target that might occur between restart points.
Consider the following example with target==250.
restart point 0: key == 100
... some keys in range [100..199] here ...
restart point 1: key == 200
... some keys in range [200..299] here ...
restart point 2: key == 300
... some keys in range [300..inf] here ...
In this example, we would want the binary search
to find restart point 1 (key == 200) and then
sequentially decode from there.
Now consider your proposed code.
First iteration:
left = 0
right = 3
mid = (left + right) / 2 == 1
Compare(mid_key, target) == Compare(200, 250)
The comparison returns < 0, so we do:
left = mid + 1 = 2
We have skipped past the correct result of binary
search (restart point 1). So we will end up starting
the decoding process too late in the block.
Sorry about the misleading comment. I will change
the following comment:
// Binary search in restart array to find the first restart point
// with a key >= target
to:
// Binary search in restart array to find the last restart point
// with a key < target
Original comment by san...@google.com
on 16 Jul 2012 at 5:04
Original issue reported on code.google.com by
yangzhuo...@gmail.com
on 15 Jul 2012 at 1:36