I noticed that each membership_for call will call counting_agent on each IBF that it traverses.
And each counting_agent will allocate memory, etc.
Since membership_for will probably be called multiple times (multiple queries to one HIBF), it might be beneficial to store the counting agents.
This needs some more benchmarking. It might already be beneficial when conducting 4096 queries on an HIBF with 4096 user bins. In this case, there was a 10% speed-up, and it did not slow down in other cases.
Click to show diff
```diff
diff --git a/include/hibf/hierarchical_interleaved_bloom_filter.hpp b/include/hibf/hierarchical_interleaved_bloom_filter.hpp
index b3ff075..bdd300f 100644
--- a/include/hibf/hierarchical_interleaved_bloom_filter.hpp
+++ b/include/hibf/hierarchical_interleaved_bloom_filter.hpp
@@ -196,11 +196,14 @@ private:
//!\brief A pointer to the augmented hierarchical_interleaved_bloom_filter.
hierarchical_interleaved_bloom_filter const * const hibf_ptr{nullptr};
+ //!\brief Stores counting_agents of all IBFs.
+ std::vector> agents{};
+
//!\brief Helper for recursive membership querying.
template
void membership_for_impl(value_range_t && values, int64_t const ibf_idx, size_t const threshold)
{
- auto agent = hibf_ptr->ibf_vector[ibf_idx].template counting_agent();
+ auto & agent = agents[ibf_idx];
auto & result = agent.bulk_count(values);
uint16_t sum{};
@@ -243,7 +246,12 @@ public:
* \param hibf The hierarchical_interleaved_bloom_filter.
*/
explicit membership_agent_type(hierarchical_interleaved_bloom_filter const & hibf) : hibf_ptr(std::addressof(hibf))
- {}
+ {
+ size_t const number_of_agents = hibf_ptr->ibf_vector.size();
+ agents.reserve(number_of_agents);
+ for (size_t i = 0; i < number_of_agents; ++i)
+ agents.emplace_back(hibf_ptr->ibf_vector[i].template counting_agent());
+ }
//!\}
//!\brief Stores the result of membership_for().
```
Other
uint64_t for UB indices, use max value as merged bin indicator
membership_for
I noticed that each
membership_for
call will callcounting_agent
on each IBF that it traverses. And eachcounting_agent
will allocate memory, etc.Since
membership_for
will probably be called multiple times (multiple queries to one HIBF), it might be beneficial to store the counting agents.This needs some more benchmarking. It might already be beneficial when conducting 4096 queries on an HIBF with 4096 user bins. In this case, there was a 10% speed-up, and it did not slow down in other cases.
Click to show diff
```diff diff --git a/include/hibf/hierarchical_interleaved_bloom_filter.hpp b/include/hibf/hierarchical_interleaved_bloom_filter.hpp index b3ff075..bdd300f 100644 --- a/include/hibf/hierarchical_interleaved_bloom_filter.hpp +++ b/include/hibf/hierarchical_interleaved_bloom_filter.hpp @@ -196,11 +196,14 @@ private: //!\brief A pointer to the augmented hierarchical_interleaved_bloom_filter. hierarchical_interleaved_bloom_filter const * const hibf_ptr{nullptr}; + //!\brief Stores counting_agents of all IBFs. + std::vectorOther
uint64_t
for UB indices, use max value as merged bin indicator