Open prataprc opened 4 years ago
Hi, @prataprc,
I would like to write the persistence (SerDes) function :)
There are many serialization formats. IMHO, SerDe aims to serialize any Rust type to any of those formats.
In this case, I think we only need binary serialization. So to begin with, we can implement a simple encode() / decode() API and do SerDe at a later point?
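To make the encode() / decode() idea concrete, here is a minimal sketch of a hand-rolled binary format. The `FilterState` struct and its fields (`seed`, `block_length`, `finger_prints`) are hypothetical stand-ins chosen for illustration, not the crate's actual layout, and the little-endian byte order is likewise an assumption.

```rust
use std::convert::TryInto;

/// Hypothetical stand-in for the state a filter would need to persist.
/// Field names are assumptions made for this sketch.
#[derive(Debug, PartialEq)]
pub struct FilterState {
    pub seed: u64,
    pub block_length: u32,
    pub finger_prints: Vec<u8>,
}

impl FilterState {
    /// Layout: seed (8 bytes, LE) | block_length (4 bytes, LE) | fingerprints.
    pub fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(12 + self.finger_prints.len());
        buf.extend_from_slice(&self.seed.to_le_bytes());
        buf.extend_from_slice(&self.block_length.to_le_bytes());
        buf.extend_from_slice(&self.finger_prints);
        buf
    }

    /// Reads back the layout written by `encode`.
    /// Returns `None` if the input is shorter than the 12-byte header.
    pub fn decode(buf: &[u8]) -> Option<FilterState> {
        if buf.len() < 12 {
            return None;
        }
        let seed = u64::from_le_bytes(buf[0..8].try_into().ok()?);
        let block_length = u32::from_le_bytes(buf[8..12].try_into().ok()?);
        Some(FilterState {
            seed,
            block_length,
            finger_prints: buf[12..].to_vec(),
        })
    }
}
```

The resulting `Vec<u8>` can be handed to `std::fs::write` and read back with `std::fs::read`, so file persistence needs nothing beyond the standard library. Fixing the byte order explicitly keeps the encoded bytes identical across platforms.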
And thanks for the offer.
https://lemire.me/blog/2019/12/19/xor-filters-faster-and-smaller-than-bloom-filters/
^ This blog post gives some idea about serializing the filter.
FWIW, I have another impl of the xor filters in Rust with optional serialization/deserialization with serde behind a feature flag: https://github.com/ayazhafiz/xorf.
Feel free to use that implementation, or we can even merge these two libraries. Let me know what you think.
@ayazhafiz, just like @prataprc says, currently we need binary serialization only. I will develop a simple file persistence function first. Your SerDe impl is a useful reference.
@ayazhafiz thanks for the offer, will give a shout-out when the need arises. Cheers,
For the new filter data structure, I would add an upgraded version of the persistence function, which could save the new attributes (keys and hash_builder).
IMHO, in case of Xor8, Serialization / De-serialization is only applicable to bitmap-index and its associated fields. That is, we only need those fields required to execute the "contain()" API.
I have tried to scope the problem of handling a really large set of keys in #9.
@prataprc Thanks for the explanation, I agree with you.
So that it can be persisted onto disk and retrieved later for membership checks.
Update1: Now serializing and de-serializing Xor8::build_hasher() is more challenging. For instance documentation from std has this to say:
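If `RandomState` is used as the BuildHasher, std says that hashers created by two different `RandomState` instances are unlikely to produce the same result for the same values.
If `DefaultHasher` is used, std says that its internal algorithm is not specified, so it and its hashes should not be relied upon over releases.
So unless we have a BuildHasher type that is stable across releases and across instances, we may not be able to provide a stable serialization and de-serialization API.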
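One way to get such a stable BuildHasher is to implement a fixed, published hash algorithm by hand. The sketch below uses 64-bit FNV-1a purely as an illustration. The type names `Fnv1a64` and `BuildFnv1a64` and the helper `digest_bytes` are hypothetical, and FNV-1a is my own choice for the example rather than something proposed in this thread.

```rust
use std::hash::{BuildHasher, Hasher};

/// 64-bit FNV-1a: a fixed algorithm with no per-process randomness,
/// so the same bytes hash to the same value on every run and release.
pub struct Fnv1a64(u64);

impl Hasher for Fnv1a64 {
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 ^= u64::from(*b);
            // FNV 64-bit prime
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Stateless builder: every instance yields an identically seeded hasher,
/// so there is nothing about the builder itself that needs serializing.
#[derive(Clone, Copy, Default)]
pub struct BuildFnv1a64;

impl BuildHasher for BuildFnv1a64 {
    type Hasher = Fnv1a64;

    fn build_hasher(&self) -> Fnv1a64 {
        // FNV 64-bit offset basis
        Fnv1a64(0xcbf2_9ce4_8422_2325)
    }
}

/// Hashes raw key bytes with any BuildHasher.
pub fn digest_bytes<B: BuildHasher>(builder: &B, key: &[u8]) -> u64 {
    let mut hasher = builder.build_hasher();
    hasher.write(key);
    hasher.finish()
}
```

With this builder, `digest_bytes(&BuildFnv1a64, b"a")` gives the same value from any instance. The sketch feeds raw bytes through `Hasher::write` rather than going through the `Hash` trait. As far as I know, the `Hash` impls for integers write native-endian bytes, so this sidesteps a platform dependency, but that is a design choice for the sketch rather than a requirement.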