phaistos-networks / Trinity

Trinity IR Infrastructure
Apache License 2.0

Faster checks against masked documents #14

Closed. markpapadakis closed this issue 6 years ago

markpapadakis commented 6 years ago

masked_documents_registry::make() should build a bloom filter from all updated_documents and also track the [min, max) range of document IDs across all of them. That way, masked_documents_registry::test() can determine in O(1) that a document is definitely not in any of the tracked scanners. This could save dozens of iterations and per-scanner checks.

Each updated_documents instance should also carry its own fixed-size bloom filter, built by Trinity::pack_updates(). We already track the [lowest, highest] document IDs for each of them.
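For illustration, here is a rough sketch of the idea, not the actual Trinity code. Every name in it (docid_t, fixed_bloom, updated_docs_sketch, pack_updates_sketch, registry_sketch, make_registry_sketch) is hypothetical, and the filter size, the hash, and the sorted-vector storage are assumptions made only to keep the example self-contained.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using docid_t = std::uint32_t; // assumption: 32-bit document IDs

// Hypothetical fixed-size bloom filter: 4096 bits, 2 probes per ID.
struct fixed_bloom {
    static constexpr std::size_t kBits = 4096;
    std::array<std::uint64_t, kBits / 64> words{};

    static std::uint64_t mix(std::uint64_t x) { // splitmix64-style mixer
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }
    void set(std::size_t i) { words[i / 64] |= 1ULL << (i % 64); }
    bool get(std::size_t i) const { return (words[i / 64] >> (i % 64)) & 1ULL; }

    void add(docid_t id) {
        const std::uint64_t h = mix(id);
        set(h % kBits);
        set((h >> 32) % kBits);
    }
    // false means "definitely absent"; true means "possibly present".
    bool may_contain(docid_t id) const {
        const std::uint64_t h = mix(id);
        return get(h % kBits) && get((h >> 32) % kBits);
    }
    void merge(const fixed_bloom &o) {
        for (std::size_t i = 0; i < words.size(); ++i) words[i] |= o.words[i];
    }
};

// Stand-in for updated_documents: sorted IDs, [lowest, highest], and a filter.
struct updated_docs_sketch {
    std::vector<docid_t> ids;
    docid_t lowest = 0, highest = 0;
    fixed_bloom bloom;
};

// Stand-in for Trinity::pack_updates(): sorts, dedups, and builds the filter.
updated_docs_sketch pack_updates_sketch(std::vector<docid_t> ids) {
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
    updated_docs_sketch u;
    for (docid_t id : ids) u.bloom.add(id);
    if (!ids.empty()) { u.lowest = ids.front(); u.highest = ids.back(); }
    u.ids = std::move(ids);
    return u;
}

// Stand-in for masked_documents_registry.
struct registry_sketch {
    std::vector<updated_docs_sketch> sets;
    fixed_bloom bloom; // union of all per-set filters
    std::uint64_t min_id = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t max_end = 0; // exclusive upper bound, i.e. [min_id, max_end)

    bool test(docid_t id) const {
        // O(1) rejections: outside the global range, or absent from the filter.
        if (id < min_id || id >= max_end) return false;
        if (!bloom.may_contain(id)) return false;
        // Slow path: only now consult each tracked set.
        for (const auto &s : sets) {
            if (id < s.lowest || id > s.highest || !s.bloom.may_contain(id)) continue;
            if (std::binary_search(s.ids.begin(), s.ids.end(), id)) return true;
        }
        return false;
    }
};

// Stand-in for masked_documents_registry::make().
registry_sketch make_registry_sketch(std::vector<updated_docs_sketch> sets) {
    registry_sketch r;
    for (const auto &s : sets) {
        if (s.ids.empty()) continue;
        r.bloom.merge(s.bloom);
        r.min_id = std::min<std::uint64_t>(r.min_id, s.lowest);
        r.max_end = std::max<std::uint64_t>(r.max_end, std::uint64_t(s.highest) + 1);
    }
    r.sets = std::move(sets);
    return r;
}
```

With something like this, test() stays exact: the range check and the merged filter can only rule a document out, and a bloom false positive just falls through to the per-set lookup. How much the filter helps depends on its size relative to the number of masked IDs, so the 4096 bits above are a placeholder rather than a recommendation.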

markpapadakis commented 6 years ago

Implemented