split block bloom filters implementation

While i am going through some amazing work you are doing , wanted to get your thoughts on some implementation details or caveats

IIUC SBBF is same as in https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h ? Are there any implementation difference or caveats in your particular approach (any significant perf difference) ? Bit confused around terminology here between block and split block so wanted to understand better if your implementation is something additional on top of classical blocked bloom filter.

I found your repo through : https://arxiv.org/pdf/2101.01719.pdf , it says for the split block bloom fpp are somewhat fixed due to pre-determined number of hash functions used . In your repo example is see double fpp = 0.0065; , is this because we are using a fixed number of hashes and bits per value (pre-configured at the given fpp level) . Am i able to tune this for lower fpp by using more hash functions and may be more bits per value ? What is the current default bits per value set in current SBBF implementation ? which parameter is that ?

In the apache kudu project which uses block bloom filter , here : https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h#L66 it says we are able to get 0.1% fpp with 15 bits per value , is it possible to achieve this in your split block BF implementation as well , or are we stuck to 0.4% - 19% range only ,

thanks for bearing with my naive questions as i am new into this area and still trying to figure out best filter for my case where i have strict latency requirement and need to insert ~10-100 million ids , these ids are not uuid but sequential range [0-100M] , so was trying to use roaring bitmaps before but now looking into this probabilistic filters since constructing filters is a potential bottleneck for me .

jbapple / libfilter

split block bloom filters implementation #22