yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] Avoid creating separate iterator for tombstone check #17843

Open spolitov opened 1 year ago

spolitov commented 1 year ago

Jira Link: DB-6931

Description

A colocated table can be truncated. In that case we add a tombstone record to the corresponding tablet, and every read then has to check the latest tombstone time.

We also use bloom filters to pick the relevant .sst files when creating a RocksDB iterator. So during a primary key lookup, the key and the tombstone record can be located in different files, and the file with the tombstone record can be filtered out by the bloom filter check for the primary key. This issue was found as #15206 and was fixed by using a separate iterator to check for the tombstone record.
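A minimal sketch of the failure mode, assuming a toy bloom filter per .sst file. The file names, the key strings, and the helper functions are hypothetical and only illustrate the idea; they are not DocDB or RocksDB code.

```python
import hashlib

NUM_BITS = 4096   # toy filter size, chosen to make false positives very unlikely
NUM_HASHES = 4


def bit_positions(key):
    """Derive NUM_HASHES bit positions for a key from salted SHA-256 digests."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % NUM_BITS
        for i in range(NUM_HASHES)
    ]


def build_bloom(keys):
    """Return the filter as an int bit set with the bits of every key turned on."""
    bits = 0
    for key in keys:
        for pos in bit_positions(key):
            bits |= 1 << pos
    return bits


def may_contain(bits, key):
    """False means the key is definitely absent; True means it may be present."""
    return all((bits >> pos) & 1 for pos in bit_positions(key))


# Hypothetical layout: the row and the table tombstone ended up in different files.
blooms_by_file = {
    "000010.sst": build_bloom(["table1/row42"]),
    "000011.sst": build_bloom(["table1/tombstone"]),
}

# File selection driven only by the primary key being looked up.
selected = [name for name, bits in blooms_by_file.items()
            if may_contain(bits, "table1/row42")]
print(selected)
```

Barring an unlikely false positive, only `000010.sst` is selected, so an iterator built over that selection never sees the tombstone stored in `000011.sst`.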

But creating an iterator takes much longer than the lookup itself, so we waste a lot of time creating two iterators.

To avoid creating a second iterator, we could enhance the bloom filter logic: pass multiple keys to the bloom filter check and have it accept every file that could contain any of the specified keys. We would then be able to create a single iterator that picks all files containing either the primary key or the tombstone record.
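The proposal could look roughly like the sketch below. It assumes a toy bloom filter per .sst file; `select_files`, the file names, and the key strings are hypothetical and are not the actual DocDB or RocksDB interface.

```python
import hashlib

NUM_BITS = 4096   # toy filter size, chosen to make false positives very unlikely
NUM_HASHES = 4


def bit_positions(key):
    """Derive NUM_HASHES bit positions for a key from salted SHA-256 digests."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % NUM_BITS
        for i in range(NUM_HASHES)
    ]


def build_bloom(keys):
    """Return the filter as an int bit set with the bits of every key turned on."""
    bits = 0
    for key in keys:
        for pos in bit_positions(key):
            bits |= 1 << pos
    return bits


def may_contain(bits, key):
    """False means the key is definitely absent; True means it may be present."""
    return all((bits >> pos) & 1 for pos in bit_positions(key))


def select_files(blooms_by_file, lookup_keys):
    """Keep every file whose filter may contain at least one of the lookup keys."""
    return [name for name, bits in blooms_by_file.items()
            if any(may_contain(bits, key) for key in lookup_keys)]


# Hypothetical layout: row, table tombstone, and an unrelated row in three files.
blooms_by_file = {
    "000010.sst": build_bloom(["table1/row42"]),
    "000011.sst": build_bloom(["table1/tombstone"]),
    "000012.sst": build_bloom(["table1/row7"]),
}

# One selection pass covers both the primary key and the tombstone key.
selected = select_files(blooms_by_file, ["table1/row42", "table1/tombstone"])
print(selected)
```

Since a bloom filter has no false negatives, both `000010.sst` and `000011.sst` are always selected, so a single iterator over the result can see the row and the tombstone. The unrelated `000012.sst` is skipped unless its filter gives a false positive.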


Huqicheng commented 1 year ago

We modify/extend BloomFilterAwareFileFilter to filter files with multiple keys.