cockroachdb / pebble

RocksDB/LevelDB inspired key-value database in Go
BSD 3-Clause "New" or "Revised" License
4.91k stars 455 forks source link

perf: consider constructing batch skiplist at iterator construction #1944

Open jbowens opened 2 years ago

jbowens commented 2 years ago

An indexed batch maintains a skiplist of its entries. Maintaining this skiplist adds significant overhead to indexed batches. When an iterator is created to read through an indexed batch, it creates an iterator over the skiplist without needing to re-index all the batch's data.

In #1640 the semantics of batch mutations were changed such that mutations to the batch after iterator construction are only observed if the iterator is explicitly refreshed (through SetOptions). The batch's skiplist is still updated with every batch mutation. If many batch modifications are made without refreshing an iterator's view of the batch, the skiplist contains many irrelevant entries. To hide these irrelevant entries from view, the batchIter must skip over them. This can result in pathological behavior where an iterator positioning operation must skip over the entirety of the skiplist in order to observe that there are no entries that are visible to the iterator.

Given the new batch semantics introduced in #1640, it might be more performant to push construction of the skiplist into NewIter and SetOptions. This would a) remove the overhead of skiplist updates on batch mutations and b) remove the pathological behavior around non-recent batch entries. It comes at the cost of more expensive iterator construction/SetOptions. The SetOptions calls can be optimized to update the iterator's existing skiplist with only the recent entries that were added since the iterator constructor or the previous SetOptions call. New iterator constructions however would need to pay the cost of indexing the entirety of the batch.

Informs #1942. Informs cockroachdb/cockroach#87277.

Jira issue: PEBBLE-146

sumeerbhola commented 2 years ago

If we have multiple iterators constructed on the Batch over time, and they are all open, does it mean they will not share a Skiplist?

jbowens commented 2 years ago

Yeah, it does. Or with some extra bookkeeping, we could allow separate iterators with similar views of the batch's contents (eg, batch offsets that are relatively close) to share a skiplist.

github-actions[bot] commented 8 months ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to Pebble!