The problem is a theoretical problem. When searching a secondary index, it's
possible that we:
(1) Forward a particular record <SK, PK> from the 2ndary index search operator
(2) In the meantime, that record is deleted and reinserted as <SK', PK>. This
is possible because we do not lock during secondary index searches.
(3) If the newly inserted record is inserted "ahead" of the search cursor, then
the search cursor will return it twice. It can be inserted ahead for a number
of reasons (e.g. SK' > SK in a B+-tree or if SK' matches an R-tree predicate
and lands on a page that the cursor hasn't visited yet).
The primary index is probed for each record returned from the 2ndary index
search. This means that both <SK, PK> and <SK', PK> are used as probes to the
primary index (i.e. we retrieve the record with <PK> twice). As long as SK'
matches the original predicate, then we will return the record twice. This is
incorrect behavior!
Fortunately, the fix is simple: use a duplicate eliminating sort on PK between
secondary and primary indexes. We already perform the sort, but it just needs
to be made to eliminate duplicates.
Original issue reported on code.google.com by zheilb...@gmail.com on 11 Mar 2014 at 10:08
Original issue reported on code.google.com by
zheilb...@gmail.com
on 11 Mar 2014 at 10:08