dib-lab / pybbhash

A Python wrapper for the bbhash library for Minimal Perfect Hashing

support generic iterators and/or input numpy arrays in constructor. #17

Open ctb opened 3 years ago

ctb commented 3 years ago

I would like to be able to pass generic Python iterators to the PyMPHF constructor. Right now construction involves a round of memory-inefficient copying of the hashes, which is bad when you have a lot of k-mers!

It would also be nice to support more memory-efficient input data structures than Python sets or lists. We already depend on numpy, so maybe a 1-D ndarray or a raw data buffer? The numpy internals are worth a look for how to read one.
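
To make the idea concrete, here is a minimal sketch of the input side only. It does not touch PyMPHF or the bbhash internals. It assumes the hashes fit in unsigned 64-bit integers; `hashes_to_array` and `fake_kmer_hashes` are hypothetical names made up for illustration. `np.fromiter` consumes any iterator one item at a time, so no intermediate list or set is built, and the result stores exactly 8 bytes per hash.

```python
from typing import Iterable, Iterator

import numpy as np


def hashes_to_array(hashes: Iterable[int]) -> np.ndarray:
    """Pack an iterable of unsigned 64-bit hashes into a 1-D uint64 array.

    Hypothetical helper, not part of pybbhash.
    """
    # np.fromiter pulls items from the iterator one by one and writes them
    # straight into the array; the array grows as needed when the length
    # is not known up front.
    return np.fromiter(hashes, dtype=np.uint64)


def fake_kmer_hashes(n: int) -> Iterator[int]:
    """Hypothetical stand-in for a real stream of k-mer hashes."""
    for i in range(n):
        # Simple multiplicative mixing, reduced to the uint64 range.
        yield (i * 0x9E3779B97F4A7C15) % 2**64


arr = hashes_to_array(fake_kmer_hashes(1000))

# The same memory is reachable through the buffer protocol, which is one
# way a C/Cython layer could read the hashes without copying them.
buf = memoryview(arr)
print(arr.dtype, arr.shape, buf.nbytes)
```

If the number of hashes is known in advance, passing `count=n` to `np.fromiter` lets numpy allocate the array once instead of growing it.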