mavam / libbf

:dart: Bloom filters for C++11
http://mavam.github.io/libbf
BSD 3-Clause "New" or "Revised" License
356 stars 89 forks source link

Created a storage method to access bitvector #11

Closed mborger closed 9 years ago

mborger commented 9 years ago

This adds an interface for getting the underlying bitvector for a basic bloom filter. This is in response to this pull request.

mavam commented 9 years ago

Thanks for contributing!

amallia commented 7 years ago

Is there a way to create a new BF given its storage? Lets say I have:

bf::basic_bloom_filter b(0.8, 100);
b.add("foo");
b.add(42);
bitvector bv = b.storage();

It would be nice to be able to do:

hasher h = ...;
bf::basic_bloom_filter b(h, bitvector);

@mavam would you agree if I open a PR to have this constructor?

mavam commented 7 years ago

Initially, the function .storage() was only meant to provide const-access to into the internals, mostly for debugging and introspection. Turns out it's being used as a serialization mechanism.

@mavam would you agree if I open a PR to have this constructor?

That'd be great! Contributions would be very welcome, and are in fact the only way to move this library forward right now.

That said, I don't think a constructor overload would make sense, because there's no way to fail in a nice way when the bitvector doesn't make sense (e.g., has size 0). Jonathan sums up the issue quite nicely in a blog post: http://foonathan.net/blog/2017/01/09/exceptions-constructor.html.

Since we don't want to throw exceptions, we should instead provide a factory function returning something along the lines of optional<T>. But since libbf doesn't require a C++17 standard library, we don't have that luxury yet. I'd say we provide an API compliant drop-in (can be copied from many implementations) in combination with a static member function T::make(hasher h, bitvector b) for a Bloom filter of type T.

amallia commented 7 years ago

Thanks for the answer @mavam! Yes the idea is to be able to serialize the bf for storage purposes.

To sum up your proposal: we only have a static function T::make(hasher h, bitvector b) which implements the creation of a bf given the hasher and the storage (bitvector). What is the reason of templates in this particular case if this is static?

mavam commented 7 years ago

To sum up your proposal: we only have a static function T::make(hasher h, bitvector b) which implements the creation of a bf given the hasher and the storage (bitvector).

Yep.

What is the reason of templates in this particular case if this is static?

Sorry if this wasn't clear: I just used T as a placeholder for an arbitrary Bloom filter type, since the library provides many. I didn't mean to imply that T is a template. I'm proposing to add such a function to all types where it makes sense.

amallia commented 7 years ago

All clear! I will make a PR asap. I am a bit busy in this period but I need this functionality.

It is good to know that PRs are well-accepted and the owner of the repo is collaborative. Thanks a lot @mavam, I appreciate!

amallia commented 7 years ago

Added PR https://github.com/mavam/libbf/pull/23

mavam commented 7 years ago

Great. If you're willing to contribute even more, I'm happy to share ideas on how to move forward with this library. For example, the next version should have a more flexible way of hashing according to N3980, which I started in topic/n3980. If you're interested in guided reviews and discussion, let me know.

amallia commented 7 years ago

I am definitely interested in contributing more. Let's create issues for that.