seqan / hibf

HIBF and IBF
https://docs.seqan.de/hibf

Investigative Todos #89

Open eseiler opened 1 year ago

eseiler commented 1 year ago

membership_for

I noticed that each membership_for call creates a counting_agent for every IBF that it traverses, and each counting_agent allocates memory, among other setup work.

Since membership_for will probably be called multiple times (multiple queries against one HIBF), it might be beneficial to create the counting agents once and store them for reuse.

This needs more benchmarking. Storing the agents might already pay off when conducting 4096 queries on an HIBF with 4096 user bins: in this case, there was a 10% speed-up, and the other cases did not get slower.

```diff
diff --git a/include/hibf/hierarchical_interleaved_bloom_filter.hpp b/include/hibf/hierarchical_interleaved_bloom_filter.hpp
index b3ff075..bdd300f 100644
--- a/include/hibf/hierarchical_interleaved_bloom_filter.hpp
+++ b/include/hibf/hierarchical_interleaved_bloom_filter.hpp
@@ -196,11 +196,14 @@ private:
     //!\brief A pointer to the augmented hierarchical_interleaved_bloom_filter.
     hierarchical_interleaved_bloom_filter const * const hibf_ptr{nullptr};
 
+    //!\brief Stores counting_agents of all IBFs.
+    std::vector> agents{};
+
     //!\brief Helper for recursive membership querying.
     template
     void membership_for_impl(value_range_t && values, int64_t const ibf_idx, size_t const threshold)
     {
-        auto agent = hibf_ptr->ibf_vector[ibf_idx].template counting_agent();
+        auto & agent = agents[ibf_idx];
         auto & result = agent.bulk_count(values);
 
         uint16_t sum{};
@@ -243,7 +246,12 @@ public:
      * \param hibf The hierarchical_interleaved_bloom_filter.
      */
     explicit membership_agent_type(hierarchical_interleaved_bloom_filter const & hibf) : hibf_ptr(std::addressof(hibf))
-    {}
+    {
+        size_t const number_of_agents = hibf_ptr->ibf_vector.size();
+        agents.reserve(number_of_agents);
+        for (size_t i = 0; i < number_of_agents; ++i)
+            agents.emplace_back(hibf_ptr->ibf_vector[i].template counting_agent());
+    }
     //!\}
 
     //!\brief Stores the result of membership_for().
```
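
For anyone who wants to benchmark the pattern in isolation, here is a minimal, self-contained sketch of the same idea. It does not use the HIBF library: `toy_filter`, `agent_type`, `query_fresh`, and `cached_query` are hypothetical stand-ins, and the toy visits all filters in a flat loop instead of traversing a hierarchy. It only contrasts building an agent per query with building all agents once in the constructor.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one IBF: bins[b] lists the values stored in bin b.
struct toy_filter
{
    std::vector<std::vector<uint64_t>> bins{};

    // Stand-in for a counting agent: owns a result buffer allocated on construction.
    struct agent_type
    {
        toy_filter const * filter{nullptr};
        std::vector<uint16_t> counts{};

        explicit agent_type(toy_filter const & f) : filter(&f), counts(f.bins.size()) {}

        // Counts, per bin, how many of the queried values the bin contains.
        std::vector<uint16_t> const & bulk_count(std::vector<uint64_t> const & values)
        {
            std::fill(counts.begin(), counts.end(), 0); // reuses the existing buffer
            for (std::size_t bin = 0; bin < filter->bins.size(); ++bin)
                for (uint64_t const value : values)
                    if (std::find(filter->bins[bin].begin(), filter->bins[bin].end(), value)
                        != filter->bins[bin].end())
                        ++counts[bin];
            return counts;
        }
    };

    agent_type counting_agent() const
    {
        return agent_type{*this};
    }
};

// Variant 1: a fresh agent (and buffer allocation) per filter on every query.
inline std::vector<uint16_t> query_fresh(std::vector<toy_filter> const & filters,
                                         std::vector<uint64_t> const & values)
{
    std::vector<uint16_t> totals{};
    for (toy_filter const & filter : filters)
    {
        auto agent = filter.counting_agent();
        auto const & counts = agent.bulk_count(values);
        totals.insert(totals.end(), counts.begin(), counts.end());
    }
    return totals;
}

// Variant 2: all agents are created once and reused for every query.
// The filters must outlive this object because the agents keep pointers to them.
class cached_query
{
    std::vector<toy_filter::agent_type> agents{};

public:
    explicit cached_query(std::vector<toy_filter> const & filters)
    {
        agents.reserve(filters.size());
        for (toy_filter const & filter : filters)
            agents.emplace_back(filter.counting_agent());
    }

    std::vector<uint16_t> operator()(std::vector<uint64_t> const & values)
    {
        std::vector<uint16_t> totals{};
        for (auto & agent : agents)
        {
            auto const & counts = agent.bulk_count(values);
            totals.insert(totals.end(), counts.begin(), counts.end());
        }
        return totals;
    }
};
```

Both variants should return identical counts for the same input. The cached variant trades a one-time setup cost and one stored buffer per filter for fewer allocations per query, so any gain depends on how many queries are run against the same set of filters.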
