linvon / cuckoo-filter

Cuckoo Filter implementation in Go: better than Bloom Filter, with configurable filter parameters and optimized for space
MIT License
294 stars, 28 forks

Enhance Support for Larger Datasets and Buckets in Encoding #11

Open EladGabay opened 1 year ago

EladGabay commented 1 year ago

This commit improves encoding so that it can handle item and bucket counts exceeding max(uint32). Previously, the encoding used uint32 for counts, even though the filter structure already supported larger values via uint. As a result, the filter only partially supported larger datasets: not all buckets were utilized (note the changes in generateIndexTagHash, altIndex and indexHash).
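To illustrate why narrowing to uint32 leaves buckets unused, here is a minimal sketch. The helper names `indexHash32` and `indexHash64` are hypothetical and are not the library's actual functions; they only show the general effect of truncating a hash before reducing it to a bucket index.

```go
package main

import "fmt"

// indexHash32 is a hypothetical helper that narrows the hash to uint32
// before reducing it modulo the bucket count. Any bucket whose index is
// above max(uint32) can never be selected.
func indexHash32(hv uint64, numBuckets uint64) uint64 {
	return uint64(uint32(hv)) % numBuckets
}

// indexHash64 is a hypothetical helper that keeps the full 64-bit hash,
// so every bucket index in [0, numBuckets) is reachable.
func indexHash64(hv uint64, numBuckets uint64) uint64 {
	return hv % numBuckets
}

func main() {
	numBuckets := uint64(1) << 33 // more buckets than max(uint32)
	hv := uint64(0x123456789)     // a hash value with a bit set above bit 31

	fmt.Println("32-bit index:", indexHash32(hv, numBuckets))
	fmt.Println("64-bit index:", indexHash64(hv, numBuckets))
}
```

With 2^33 buckets, the 32-bit variant can only ever land in the first 2^32 of them, which is the kind of partial utilization described above.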

Now, all references to bucket indices and item counts explicitly use uint64. A new encoding format accommodates larger filters. To distinguish the legacy format (up to max(uint32) items) from the new one, a prefix marker is introduced.

Decoding seamlessly supports both formats. The encode method takes a legacy boolean parameter for gradual adoption.
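For readers unfamiliar with the approach, a rough sketch of marker-based dual-format encoding follows. Everything here is an assumption made for illustration: the marker bytes, the header layout, the little-endian byte order, and the function names `encodeHeader` and `decodeHeader` are hypothetical and do not reflect the actual format in this PR. The sketch reserves 0xFFFFFFFF in the first legacy field so the marker can never collide with a legacy header.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// newFormatMarker is a hypothetical prefix identifying the new layout.
var newFormatMarker = [4]byte{0xFF, 0xFF, 0xFF, 0xFF}

// encodeHeader writes the counts in one of two assumed layouts:
//   legacy: numItems uint32 | numBuckets uint32            (8 bytes)
//   new:    marker (4 bytes) | numItems uint64 | numBuckets uint64 (20 bytes)
func encodeHeader(numItems, numBuckets uint64, legacy bool) ([]byte, error) {
	if legacy {
		// numItems must stay strictly below max(uint32) so that a legacy
		// header never starts with the marker bytes.
		if numItems >= math.MaxUint32 || numBuckets > math.MaxUint32 {
			return nil, errors.New("counts do not fit the legacy format")
		}
		buf := make([]byte, 8)
		binary.LittleEndian.PutUint32(buf[0:4], uint32(numItems))
		binary.LittleEndian.PutUint32(buf[4:8], uint32(numBuckets))
		return buf, nil
	}
	buf := make([]byte, 20)
	copy(buf, newFormatMarker[:])
	binary.LittleEndian.PutUint64(buf[4:12], numItems)
	binary.LittleEndian.PutUint64(buf[12:20], numBuckets)
	return buf, nil
}

// decodeHeader accepts either layout and returns the counts as uint64.
func decodeHeader(buf []byte) (numItems, numBuckets uint64, err error) {
	if len(buf) >= 4 && bytes.Equal(buf[:4], newFormatMarker[:]) {
		if len(buf) < 20 {
			return 0, 0, errors.New("new-format header too short")
		}
		return binary.LittleEndian.Uint64(buf[4:12]),
			binary.LittleEndian.Uint64(buf[12:20]), nil
	}
	if len(buf) < 8 {
		return 0, 0, errors.New("legacy header too short")
	}
	return uint64(binary.LittleEndian.Uint32(buf[0:4])),
		uint64(binary.LittleEndian.Uint32(buf[4:8])), nil
}

func main() {
	for _, legacy := range []bool{true, false} {
		buf, err := encodeHeader(1000, 1<<20, legacy)
		if err != nil {
			panic(err)
		}
		items, buckets, err := decodeHeader(buf)
		fmt.Println(legacy, len(buf), items, buckets, err)
	}
}
```

In this sketch, callers can keep passing `legacy = true` until every reader is able to decode the new layout, then switch over.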

EladGabay commented 1 year ago

@linvon would you like to take a look? 😊

linvon commented 1 year ago

> @linvon would you like to take a look? 😊

Sorry, busy with work, but I will find some time to handle this

EladGabay commented 1 year ago

Hi, @linvon , let me know if you need any help :)

EladGabay commented 1 year ago

@linvon gentle ping

EladGabay commented 10 months ago

Hi @linvon do you think it's going to be merged soon? 🙏