phiweger / zoo

A portable datastructure for rapid prototyping in (viral) bioinformatics (under development).

Create a Redis backend to store/query an SBT #104

Open phiweger opened 7 years ago

phiweger commented 7 years ago

Redis seems a good choice for Bloom filter implementations because of its SETBIT and GETBIT commands, see e.g.

https://github.com/seomoz/pyreBloom
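To make the SETBIT/GETBIT idea concrete, here is a minimal sketch of a Bloom filter that only needs a client with `setbit(name, offset, value)` and `getbit(name, offset)`. It is not how pyreBloom or sourmash do it. The `InMemoryBits` stand-in, the `BloomFilter` class, the key name `sbt:node:0`, and the sizing defaults are all assumptions made up for illustration.

```python
import hashlib


class InMemoryBits:
    """Hypothetical stand-in for a Redis client, so the sketch runs without a server."""

    def __init__(self):
        self._bits = {}

    def setbit(self, name, offset, value):
        # Like SETBIT, return the bit previously stored at this offset.
        previous = self._bits.get((name, offset), 0)
        self._bits[(name, offset)] = value
        return previous

    def getbit(self, name, offset):
        # Like GETBIT, unset bits read as 0.
        return self._bits.get((name, offset), 0)


class BloomFilter:
    """Bloom filter whose bit array lives under a single key of the client."""

    def __init__(self, client, key, num_bits=1 << 20, num_hashes=4):
        self.client = client
        self.key = key
        self.num_bits = num_bits      # assumed size, not tuned
        self.num_hashes = num_hashes  # assumed count, not tuned

    def _offsets(self, item):
        # Derive one bit offset per seed by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for offset in self._offsets(item):
            self.client.setbit(self.key, offset, 1)

    def __contains__(self, item):
        # True means "possibly present", False means "definitely absent".
        return all(self.client.getbit(self.key, offset) for offset in self._offsets(item))


bf = BloomFilter(InMemoryBits(), key="sbt:node:0")
bf.add("ACGTACGT")
print("ACGTACGT" in bf)
```

With redis-py installed and a server running, a `redis.Redis()` client should be usable in place of `InMemoryBits`, since it exposes `setbit` and `getbit` with the same argument order. Each call is then a network round trip, so batching through a pipeline would probably be needed for real workloads.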

This relates to https://github.com/dib-lab/sourmash/issues/144 (discussion with @luizirber on how to store SBTs in sourmash).

Test: given that the average laptop has 8 or 16 GB of RAM, how many MinHash signatures can we fit in memory?
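A rough upper bound can be estimated before measuring anything. The sketch below assumes the SBT is a full binary tree (N leaves means 2N - 1 nodes), that every node holds one uncompressed Bloom filter of the same size, and that Redis adds no per-key overhead. None of this is guaranteed in practice, and `max_sbt_leaves` and the `bloom_bits` value of 1,000,000 are placeholders rather than sourmash defaults.

```python
def max_sbt_leaves(ram_bytes, bloom_bits):
    """Upper bound on leaves (signatures) for a full binary SBT that fits in RAM."""
    bytes_per_node = (bloom_bits + 7) // 8  # round bits up to whole bytes
    max_nodes = ram_bytes // bytes_per_node
    # A full binary tree with N leaves has 2N - 1 nodes, so N <= (nodes + 1) / 2.
    return (max_nodes + 1) // 2


GIB = 2**30
# Assumed Bloom filter size per node; replace with the value actually used.
estimates = {gib: max_sbt_leaves(gib * GIB, bloom_bits=1_000_000) for gib in (8, 16)}
for gib, leaves in estimates.items():
    print(f"{gib} GiB: at most {leaves} signatures")
```

The real number will be lower, since the operating system, Redis bookkeeping, and signature metadata all take their share. An actual test against a running instance is still needed.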

Workflow: create an SBT from a collection (or combine existing ones) and save it directly to a Redis instance, without writing to disk. Provide a command-line flag to persist to disk as a .zip if need be.

zoo sbt_index ... --cells couple,of,collections --persist <redis instance>
zoo sbt_search ... <redis instance>

Alternatively, propose a pull request that extends the current SBT code in sourmash directly.