Implement Rank/Select - Githubissues

daniel-j-h commented 3 years ago

We should look into implementing rank/select structures for bitset / elias-fano https://github.com/tinygraph/tinygraph/issues/13

For example have a look at

"Space-efficient, high-performance rank and select structures on uncompressed bit sequences" by Zhou, Andersen, and Kaminsky

which seems to be simple and practical.

cc @ucyo

daniel-j-h commented 3 years ago

I have added the building blocks for succinct rank/select data structures

https://github.com/tinygraph/tinygraph/blob/65770f0f8d9cf7acffcd1cb75f6dcc303bbdbd88/tinygraph-bits.h#L8-L26

uses the BMI2 instruction set on modern processors for uint64 rank/select.

From here we can implement e.g. the poppy paper above, or a different approach.

daniel-j-h commented 3 years ago

Alex Bowe has some amazing material from this domain; we should check out his writing

https://raw.githubusercontent.com/alexbowe/wavelet-paper/thesis/thesis.pdf

rank-select-1

daniel-j-h commented 3 years ago

See also this paper describing https://github.com/ot/succinct

Also this thesis going with it.

daniel-j-h commented 3 years ago

This paper https://arxiv.org/abs/1706.00990 describes the machine word select we are using for succinct data structures

https://github.com/tinygraph/tinygraph/blob/893de952a71ed765978693788f89ed6e66de9c68/tinygraph-bits.c#L27-L36

daniel-j-h commented 3 years ago

The paper http://www.cs.cmu.edu/~dga/papers/zhou-sea2013.pdf - termed "poppy" - from

"Space-efficient, high-performance rank and select structures on uncompressed bit sequences" by Zhou, Andersen, and Kaminsky

sounds very practical keeping in mind the realities of hardware architectures and cache hierarchies; they

get down to 3.2% space overhead for rank
get down to 0.39% space overhead for select

and more importantly they also provide a simple basic rank sketch which is simple to implement, has 12.5% space overhead, and allows us then to iterate easily in the future. The first iteration builds up a single index; the second iteration improves on this to get down to 3.2% space overhead for rank.

Their main insights (for rank, since that's what I have read so far) are as follows

cache misses are way more important than optimizing for instructions
a processor cache line is 64 bytes nowadays; use 64 byte "basic blocks"
the popcount instruction is the fastest way to count bits in these basic blocks
build up an index where we store the cumulative sums for each basic block

This means we create an index and store one 64bit uint for every 64 bytes (= 512 bits) in the original bit vector.

During rank query time we use the pre-computed index, plus the remaining bits in the last basic block.

daniel-j-h commented 3 years ago

Adding here

We already provide the machine word popcount functionality in the bits module

https://github.com/tinygraph/tinygraph/blob/893de952a71ed765978693788f89ed6e66de9c68/tinygraph-bits.c#L16-L25

and memory alignment to a 64 byte cache line we can simply do with posix_memalign(&p, 64, size).

daniel-j-h commented 4 months ago

Since opening this issue there has been research to improve on the cs-poppy solution we wanted to start with here.

An improvement on poppy, called "pasta" (see "pasta-flat" in the paper):

Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors

In practice, the smallest (uncompressed) rank and select data structure cs-poppy has a space overhead of ≈ 3.51 % [Zhou et al., SEA ‘13]. Using the same overhead, we present a data structure that can answer queries up to 8 % (rank) and 16.5 % (select) faster compared with cs-poppy.

screen grabs

![pasta1](https://github.com/tinygraph/tinygraph/assets/527241/f609313f-960f-4c7f-8c58-8ad89c1008fe) ![pasta2](https://github.com/tinygraph/tinygraph/assets/527241/921cbadc-c1e9-4f05-9c3c-6ab6825db505)

An improvement on poppy and pasta altogether:

SPIDER: Improved Succinct Rank and Select Performance

However, rank and select query performance still incurs a tradeoff between query time and space. For example, Vigna [ 27 ] gives a data structure for rank queries using 25% space that is roughly 19% faster than pasta-flat, and a data structure for select queries using 12.2% space, which is roughly 65% faster than pasta-flat.

and a few tricks

screen grabs

![spider1](https://github.com/tinygraph/tinygraph/assets/527241/9d363f3e-4bd8-4c2a-a842-00ae4c3b1e55)

daniel-j-h commented 3 months ago

I have a first version ready in https://github.com/tinygraph/tinygraph/pull/41

tinygraph / tinygraph

Implement Rank/Select #14