dib-lab / khmer

In-memory nucleotide sequence k-mer counting, filtering, graph traversal and more
http://khmer.readthedocs.io/
Other
747 stars 294 forks source link

Investigate RRR encoding for our Bloom filters (and CountMin Sketches?) #1074

Open ctb opened 9 years ago

ctb commented 9 years ago

From Sequence Bloom Trees, http://biorxiv.org/content/early/2015/03/26/017087, ref 17:

[17] Rajeev Raman, Venkatesh Raman, and S. Srinivasa Rao. Succinct indexable dictionaries with applications to encoding k-ary trees and multisets. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’02, pages 233–242, Philadelphia, PA, USA, 2002. Society for Industrial and Applied Mathematics.

Apparently this permits O(log N) querying of Bloom filters. Might have to be adapted for our implementation.

ctb commented 8 years ago

keyword: compression.