BurntSushi / suffix

Fast suffix arrays for Rust (with Unicode support).
The Unlicense
263 stars 30 forks source link

Any interest in string B trees? #11

Closed chadbrewbaker closed 4 years ago

chadbrewbaker commented 4 years ago

After his linear time suffix array work, Pang Ko came up with a string B-tree data structure for cache oblivious access:

https://lib.dr.iastate.edu/rtd/15942/

Any interest in a PR for a string B-tree or should i make a new repository? Ideally it would handle RAM, local disk, and a network store like AWS S3. I know at least one team at Amazon that would use this internally at scale. Toss in a Bloom filter on the network store shards and it could save obscene amounts of IO for cold reads.

BurntSushi commented 4 years ago

Best. But this is a crate for suffix arrays. Not sure why you would think btrees and AWS support would belong in it. So I'd say start a new project. :)

chadbrewbaker commented 4 years ago

Lol. Nah, just a generic interface for layers beyond RAM with a reference implementation for POSIX files. Pulling in every network data store under the sun would be way out of scope.