Open JamesA212 opened 11 months ago
I believe that as long as you've enabled bloom filters that this is equivalent to opening a prefix iterator with an iterate_upper_bound
, and then checking whether it is valid.
you would still need to seek which is not as efficient as it can be, there is an optimization that can be done where we don't have to seek to the first key. See https://github.com/facebook/rocksdb/issues/11644#issuecomment-2004766793, effectively this would be equivalent to
select * from x where y like 'blah%' limit 1;
To implement this query in rocksdb today, you would need to seek to the first kv that matches but if we say, I don't care what kv is retrieved as long as it starts with blah. Then we can get our results much faster
I'm not sure that that is true. There is no complete list of keys except in the sstable, so I don't think that you can know that the key actually exists without seeking to it. The bloom filter can't help you with that.
The bloom filter can't help you with that.
Right, in this case we have a prefix bloom, etc.. However, you can't fit everything in the bloom so for the cases where the bloom is a MISS and we have to start opening files, to find the kvs, one way to do it (sounds easier than it actually is) is to stop opening files once we find any kv that matches the prefix instead of continuing to open files/seeking to first kv.
The optimization I'm hoping to get is for the MISSes
Saw this feature being filed https://github.com/speedb-io/speedb/issues/685. Given speedb is just a fork of rocksdb, rocksdb should be able to implement some efficient way to check if a prefix matches any key in the cf/db?