Open mikemccand opened 1 year ago
Lucene is also doing the "accumulate docid deltas into the absolute docid" too in this loop, but I guess Tantivy does this separately somehow?
I believe Tantivy does the same, except that it can take advantage of SIMD to accumulate docid deltas into the absolute docid (if it did not accumulate deltas up-front, it could not run a branchless binary search later on). I tried to look into whether we can do the same with Panama, but last time I checked it doesn't give ways to use _mm_slli_si128
, which prevents us from making a faster prefix sum through vectorization: https://github.com/jpountz/vectorized-prefix-sum.
For reference, there is also this old @mkhludnev idea about encoding dense postings lists as bitsets, which would naturally help with skipping: #6116 (or can we do it on a per-block basis?). And more generally, there are some formats that are better at skipping within blocks like Elias-Fano.
Description
One of the differences between Tantivy and Lucene is that Tantivy supports within-block skipping (branchless binary search to locate a target docid inside a block of 128 postings), but Lucene skips only to block boundaries, and then does a linear scan within. Lucene is also doing the "accumulate docid deltas into the absolute docid" too in this loop, but I guess Tantivy does this separately somehow?
Anyway, we could explore within-block skipping as well -- it might make conjunctive queries where both terms have roughly the same cardinality, quite a bit faster?