Open zaidoon1 opened 3 days ago
It looks like something similar was requested before: https://groups.google.com/g/rocksdb/c/bb6Db8Y3xwU
@ajkr What do you think about a feature like this? It seems very useful/high impact, but I'm not sure what the level of effort is.
Say my key format is `<account_id>:<user_id>:<some dynamic value>`. Today, we can create a prefix extractor/bloom on `<account_id>:<user_id>` to help with queries that start with some known `<account_id>:<user_id>`. HOWEVER, what we can't do today is ALSO set up a prefix extractor on `<account_id>`.
This way, I can use bloom filters for queries that know the account id + user id combination as well as for queries that only have an account id. Effectively, in db/sql terminology, this is like being able to create multiple indexes on the "columns" to optimize queries like `select * from blah where account_id = 123` and `select * from blah where account_id = 345 and user_id = 678`.
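To make the two lookups concrete, here is a minimal Python sketch of the two prefixes each key would contribute. It is only an illustration, not RocksDB code: the helper names `split_key`, `account_prefix` and `account_user_prefix` are hypothetical, keys are assumed to be `:`-delimited bytes, and keeping the trailing `:` in each prefix is an assumption made so that account `12` does not also match account `123`.

```python
def split_key(key: bytes):
    """Split '<account_id>:<user_id>:<rest>' into its three parts.

    The dynamic tail may itself contain ':' because we split at most twice.
    """
    parts = key.split(b":", 2)
    if len(parts) != 3:
        raise ValueError(f"key does not match the assumed format: {key!r}")
    return parts[0], parts[1], parts[2]


def account_prefix(key: bytes) -> bytes:
    """Shorter prefix: serves 'where account_id = X' style lookups."""
    account_id, _user_id, _rest = split_key(key)
    return account_id + b":"


def account_user_prefix(key: bytes) -> bytes:
    """Longer prefix: serves 'where account_id = X and user_id = Y' lookups."""
    account_id, user_id, _rest = split_key(key)
    return account_id + b":" + user_id + b":"


key = b"345:678:some-dynamic-value"
print(account_prefix(key))       # b'345:'
print(account_user_prefix(key))  # b'345:678:'
```

In RocksDB the prefix logic is supplied through the column family's `prefix_extractor` option (a `SliceTransform`), and today only one of these two functions can be chosen per cf.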
As far as I know, today we can only have one prefix extractor/bloom per cf, so we have the following workarounds, neither of which is ideal:

- Create another cf that duplicates the data, so that one cf has the `<account_id>:<user_id>` prefix extractor and the other has the `<account_id>` prefix extractor. Depending on the query/what we already know, we look up the kv from the corresponding cf. The issue here is that we need more disk space to store the duplicate data.
- Given that `<account_id>` is common to both prefix extractors (in this use case) and we always have it, use it as the prefix extractor. However, we then miss the opportunity to optimize queries that also have `<user_id>`.
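
What the request amounts to can be sketched with a toy model: each file's filter records both prefixes of every key, so either kind of query can skip files that cannot match, without copying the data into a second cf. The Python below is a hedged sketch under stated assumptions, not how RocksDB builds or stores filters: `ToyFileFilter` is a hypothetical name, and an exact `set` stands in for a Bloom filter (so this model has no false positives, unlike a real one).

```python
class ToyFileFilter:
    """Stand-in for one file's prefix filter (exact set, not a real Bloom filter)."""

    def __init__(self):
        self._prefixes = set()

    def add_key(self, key: bytes) -> None:
        # Assumed key format: <account_id>:<user_id>:<some dynamic value>.
        # Raises ValueError if the key has fewer than three ':'-separated parts.
        account_id, user_id, _rest = key.split(b":", 2)
        # Register BOTH prefixes, which is the behavior being requested.
        self._prefixes.add(account_id + b":")
        self._prefixes.add(account_id + b":" + user_id + b":")

    def may_contain_prefix(self, prefix: bytes) -> bool:
        # False means the file can be skipped for this prefix query.
        return prefix in self._prefixes


# Two hypothetical files holding different accounts.
file_a, file_b = ToyFileFilter(), ToyFileFilter()
file_a.add_key(b"123:1:x")
file_b.add_key(b"345:678:y")

# account-only query: account_id = 123
print([f.may_contain_prefix(b"123:") for f in (file_a, file_b)])      # [True, False]
# account + user query: account_id = 345 and user_id = 678
print([f.may_contain_prefix(b"345:678:") for f in (file_a, file_b)])  # [False, True]
```

The trade-off in this model is extra filter entries rather than a second copy of the data.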