Add `rsdic` combined rank and select data structure

sujayakar commented 5 years ago

I've ported @hillbig's implementation [1] of Navarro and Providel's data structure [2] for simultaneously accelerating rank and select queries over a bit vector. This structure is great for select heavy workloads, like wavelet matrices [3] that currently binary search over ranks. The shape of the implementation is largely the same but with some small changes and Rust idioms.

The testing is all through quickcheck by testing all inputs for rank and select against a naive implementation.

I've also added some basic benchmarks, and the results are already pretty good. This benchmark generates a random 1M bitvector and queries the rank at 1000 different indices.

rsdic::rank             time:   [16.738 us 16.957 us 17.196 us]
jacobson::rank          time:   [21.522 us 21.721 us 21.958 us]
rank9::rank             time:   [7.7177 us 7.7851 us 7.8541 us]
rsdic::select0          time:   [31.906 us 32.131 us 32.372 us]
rsdic::select1          time:   [34.378 us 34.956 us 35.641 us]
rank9::binsearch::select0
                        time:   [296.19 us 302.05 us 307.71 us]
rank9::binsearch::select1
                        time:   [291.32 us 298.60 us 305.48 us]

So, this data structure is faster than Jacobson for rank but slower than rank9, and it's much faster than binary search for my given benchmark.

Then, I added SIMD acceleration for the rank computation, and it's closer to rank9 now. It's behind an experimental feature flag since that whole ecosystem isn't fully settled yet. Here's the benchmarks rerun with the feature and RUSTFLAGS="-C target-cpu=native" on my 2018 MBP.

rsdic::rank             time:   [8.6908 us 8.7916 us 8.9432 us]
jacobson::rank          time:   [18.180 us 18.654 us 19.135 us]
rank9::rank             time:   [7.3820 us 7.5698 us 7.7660 us]
rsdic::select0          time:   [43.371 us 44.225 us 45.118 us]
rsdic::select1          time:   [35.664 us 36.867 us 38.243 us]
rank9::binsearch::select0
                        time:   [263.92 us 268.39 us 273.62 us]
rank9::binsearch::select1
                        time:   [257.19 us 263.10 us 269.85 us]

Not all of the speed up is just from changing target-cpu: Here's the same benchmark without the SIMD acceleration feature.

rsdic::rank             time:   [13.612 us 13.684 us 13.769 us]

[1] https://github.com/hillbig/rsdic [2] https://users.dcc.uchile.cl/~gnavarro/ps/sea12.1.pdf [3] https://github.com/sekineh/wavelet-matrix-rs

tov commented 4 years ago

Cool stuff! I saw you just made a change. Is this ready to merge?

sujayakar commented 4 years ago

Yep, I think so! I have some local changes to vectorize select and add a builder to speed up construction, but I can submit subsequent PRs for those :)

sujayakar commented 4 years ago

@tov, ping on this? Let me know if there's anything I can help with.

tov commented 4 years ago

Nope, sorry, I just haven’t had I time to look at it yet! Should get to it quite soon.

On Tue, Nov 19, 2019 at 12:32 PM Sujay Jayakar notifications@github.com wrote:

@tov https://github.com/tov, ping on this? Let me know if there's anything I can help with.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/tov/succinct-rs/pull/10?email_source=notifications&email_token=AAFS3BBV5ZUAJHGZ6QHJRTTQUQWKDA5CNFSM4JHFDJDKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEPHTPQ#issuecomment-555645374, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFS3BENBVWU4CXL5OXBVBLQUQWKDANCNFSM4JHFDJDA .

-- Dr. Jesse A. Tov Assistant Professor of Instruction Computer Science McCormick School of Engineering Northwestern University

http://www.cs.northwestern.edu/~jesse/

sujayakar commented 4 years ago

hey @tov, I'm going to hack on this a bit over the holidays anyways, so I'll put it up in a separate repo and we can chat about merging later if you'd like.

tov commented 4 years ago

Okay, please keep me posted. Over the holidays is when I have time as well…

sujayakar commented 4 years ago

Cool, I've put up the library at https://github.com/sujayakar/rsdict and fixed most of the issues I wanted to (docstrings, from_blocks constructor, runtime CPU feature detection), so take a look when you have time. Shall I resubmit a PR?

tov commented 4 years ago

Yes, please do! Sorry again for the delay.

sujayakar commented 4 years ago

hmm, looks like packed_simd only supports nightly Rust. @tov, would you rather we put SIMD support behind a feature gate or only support nightly for this crate?

Edit: I just added the feature gate :)

tov commented 4 years ago

+1 to the feature gate.

sujayakar commented 4 years ago

@tov, this should be ready for another review!

ucyo commented 4 years ago

@sujayakar There seems to be a change in behaviour with RsDict. The rank1(pos) operation does not consider the bit at pos anymore. Minimal example:

    let keys = vec![0,2,4,5];
    let mut bv: BitVector<u64> = BitVector::with_fill(m as u64 + 1, false);
    for k in keys {
        bv.set_bit(k as u64, true);
    }
    let mut rs = RsDict::new();
    for bit in bv.iter() {
        rs.push(bit);
    }
    let jj = Rank9::new(bv);
    println!("bv {} rs {}", jj.rank1(0), rs.rank1(0));  // jj returns 1, while rs returns 0

sujayakar commented 4 years ago

hi @ucyo, I've put up this code here and it's on crates.io too. There's a rank_inclusive method that considers the bit at pos as well.

tov commented 4 years ago

Did you close this because I've been slow to respond, or because you decided it would be better in a separate crate? I’m back now, but I’d also be happy to share maintenance duties here.

sujayakar commented 4 years ago

hey @tov, yeah, I just wanted to wrap up that side project of mine.

I'd be happy to merge the library back into succinct-rs if you're okay with making me a maintainer. That way I can maintain rsdict with shorter cycle times, and I'd also love to improve the other structures in this library. I'm actually a bit busy at the moment but would pick that up when I have more time.

tov / succinct-rs

Add `rsdic` combined rank and select data structure #10