masked_documents_registry::make() should build a bloom filter from all updated_documents and also track the [min, max) document IDs across all of them.
This lets masked_documents_registry::test() determine in O(1) that a document is definitely not in any of the tracked scanners, which could save dozens of iterations and per-scanner checks.
To support this, updated_documents should include a fixed-size bloom filter, created by Trinity::pack_updates(). We already track the [lowest, highest] document IDs for each of them.
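
A minimal sketch of the idea follows. It is not Trinity's actual code: every name ending in _sketch, the fixed_bloom type, the 4096-bit filter size, the two hash seeds, and the 32-bit docid_t are assumptions made for illustration. It shows why a fixed size helps: filters that share a size and hash functions can be merged with a bitwise OR in make(), so test() only needs one range comparison and one filter probe before falling back to the per-scanner checks.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using docid_t = std::uint32_t; // assumption: 32-bit document IDs

// Hypothetical fixed-size bloom filter: 2 probes into a 4096-bit table.
struct fixed_bloom {
    static constexpr std::size_t kBits = 4096;
    static constexpr std::array<std::uint32_t, 2> kSeeds{0x01234567u, 0x89abcdefu};
    std::array<std::uint64_t, kBits / 64> words{};

    static std::uint32_t mix(std::uint32_t x, std::uint32_t seed) {
        x ^= seed; x *= 0x9E3779B1u; x ^= x >> 15; x *= 0x85EBCA77u; x ^= x >> 13;
        return x;
    }
    void add(docid_t id) {
        for (auto seed : kSeeds) {
            const auto b = mix(id, seed) % kBits;
            words[b / 64] |= std::uint64_t(1) << (b % 64);
        }
    }
    // false: definitely absent. true: possibly present.
    bool maybe_contains(docid_t id) const {
        for (auto seed : kSeeds) {
            const auto b = mix(id, seed) % kBits;
            if (!(words[b / 64] & (std::uint64_t(1) << (b % 64)))) return false;
        }
        return true;
    }
    // Same size and hashes, so the union of two filters is a bitwise OR.
    void merge(const fixed_bloom &o) {
        for (std::size_t i = 0; i < words.size(); ++i) words[i] |= o.words[i];
    }
};

// Stand-in for updated_documents: sorted IDs, inclusive bounds, bloom filter.
struct updated_documents_sketch {
    std::vector<docid_t> ids;
    docid_t lowest = 0, highest = 0; // meaningful only when ids is non-empty
    fixed_bloom bloom;
};

// Stand-in for Trinity::pack_updates(): builds the filter at pack time.
updated_documents_sketch pack_updates_sketch(std::vector<docid_t> ids) {
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
    updated_documents_sketch ud;
    for (docid_t id : ids) ud.bloom.add(id);
    if (!ids.empty()) { ud.lowest = ids.front(); ud.highest = ids.back(); }
    ud.ids = std::move(ids);
    return ud;
}

// Stand-in for masked_documents_registry.
struct registry_sketch {
    std::vector<updated_documents_sketch> scanners;
    // Half-open [min_id, max_id_excl); 64-bit so highest + 1 cannot overflow.
    std::uint64_t min_id = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t max_id_excl = 0;
    fixed_bloom bloom;

    static registry_sketch make(std::vector<updated_documents_sketch> sets) {
        registry_sketch r;
        for (auto &s : sets) {
            if (s.ids.empty()) continue;
            r.min_id = std::min<std::uint64_t>(r.min_id, s.lowest);
            r.max_id_excl = std::max<std::uint64_t>(r.max_id_excl, std::uint64_t(s.highest) + 1);
            r.bloom.merge(s.bloom);
            r.scanners.push_back(std::move(s));
        }
        return r;
    }

    bool test(docid_t id) const {
        // O(1) rejections: outside the global range, or absent from the merged filter.
        if (id < min_id || id >= max_id_excl) return false;
        if (!bloom.maybe_contains(id)) return false;
        // Slow path: exact check against each scanner.
        for (const auto &s : scanners)
            if (id >= s.lowest && id <= s.highest &&
                std::binary_search(s.ids.begin(), s.ids.end(), id))
                return true;
        return false;
    }
};
```

The bloom filter can return false positives, so the slow path still has to run whenever both quick checks pass. It never returns false negatives, so an early rejection is always safe.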