lemire opened this issue 7 years ago
In particular, if you intersect one empty bitmap with lots of non-empty bitmaps, the algorithm should quickly return the empty set, without doing much merging.
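For example, here is a minimal sketch of that scenario using roaring's public API (the parallelism value and bitmap contents are just illustrative):

```go
package main

import (
	"fmt"

	"github.com/RoaringBitmap/roaring"
)

func main() {
	empty := roaring.New()

	// Two large, heavily overlapping bitmaps.
	big1 := roaring.New()
	big2 := roaring.New()
	big1.AddRange(0, 10_000_000)
	big2.AddRange(0, 10_000_000)

	// Because one of the inputs is empty, the aggregation should be able to
	// return almost immediately instead of merging container by container.
	result := roaring.ParAnd(4, empty, big1, big2)
	fmt.Println(result.GetCardinality()) // 0
}
```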
I think something like this should work (I wrote a dummy version but right now it crashes):

- Sort the input bitmaps first. The worst case stays at `O(B log(B) + N)`, but it may exit faster.
- For each key `key1` in the first bitmap `bm1`:
  - For each bitmap `bmi` other than the first:
    - If there are no keys left in `bmi`, then we exhausted the input and need to return whatever we aggregated already.
    - Otherwise, look at its current key `keyi`:
      - If `keyi < key1`, skip the key, increasing the index for `bmi`, then go again.
      - If `keyi == key1`, move on to the next `bmi`.
      - Otherwise `key1` will be empty after the and, so just skip it and try with the next key from `bm1`.
  - If we finish the `bmi` loop without breaking from it, this means we found a key that's defined in all input bitmaps. Create a slice of containers for the current indices and pass it to the input channel. Go ahead with the next `key1` from `bm1`.

A more sophisticated version would use shotgun search for the current key, as well as skipping to the maximum found from the last iteration. But I think that's a reasonable draft.
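Here is a minimal Go sketch of that loop. The types are made-up stand-ins (a `bitmap` holding parallel `keys`/`containers` slices, and a `work` channel feeding the AND workers), not roaring's real internals:

```go
package sketch

// Hypothetical, simplified stand-ins for roaring's internal layout: a sorted
// slice of 16-bit keys with one container per key.
type container interface{}

type bitmap struct {
	keys       []uint16
	containers []container
}

// intersectKeys walks the keys of the first bitmap and skips forward in the
// others, sending the matching containers to the workers whenever a key is
// present in every input. It returns as soon as any input is exhausted.
func intersectKeys(bitmaps []*bitmap, work chan<- []container) {
	idx := make([]int, len(bitmaps)) // idx[i] is the current position in bitmaps[i]
	bm1 := bitmaps[0]

outer:
	for ; idx[0] < len(bm1.keys); idx[0]++ {
		key1 := bm1.keys[idx[0]]

		for i := 1; i < len(bitmaps); i++ {
			bmi := bitmaps[i]

			// Skip keys smaller than key1.
			for idx[i] < len(bmi.keys) && bmi.keys[idx[i]] < key1 {
				idx[i]++
			}
			// bmi is exhausted: no further key can be common to all inputs.
			if idx[i] == len(bmi.keys) {
				return
			}
			// key1 is missing from bmi, so its AND would be empty; try the next key1.
			if bmi.keys[idx[i]] != key1 {
				continue outer
			}
		}

		// key1 is defined in every bitmap: hand its containers to the workers.
		cs := make([]container, len(bitmaps))
		for i, bm := range bitmaps {
			cs[i] = bm.containers[idx[i]]
		}
		work <- cs
	}
}
```

Note that an empty input makes this exit almost immediately: either `bm1` has no keys and the outer loop never runs, or the first pass over the empty `bmi` hits the exhaustion check and returns.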
It works, but doesn't change a bit 🤷 However, using a pool for the containers slice does decrease allocations and memory use by 7-8% in the benchmarks, without an execution time penalty.
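For reference, a pool for those slices could look roughly like this, continuing the `package sketch` file above (so `container` comes from there) and using the standard library's `sync.Pool`; the starting capacity is an arbitrary choice, not what the actual change uses:

```go
package sketch

import "sync"

// containerSlicePool recycles the per-key []container slices instead of
// allocating a fresh one for every matching key.
var containerSlicePool = sync.Pool{
	New: func() interface{} {
		s := make([]container, 0, 8)
		return &s
	},
}

// getContainers returns a reusable slice with length zero.
func getContainers() *[]container {
	return containerSlicePool.Get().(*[]container)
}

// putContainers resets the slice and hands it back to the pool. The caller
// must only do this once the worker has finished the AND over the containers.
func putContainers(s *[]container) {
	*s = (*s)[:0]
	containerSlicePool.Put(s)
}
```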
There should never be any need for sorting.
You should never have to visit a key more than once.
AFAIK you don't visit keys more than once with this approach. The sorting is just to exit before going through them all, but I don't think it makes much of a difference.
I have an implementation that passes tests. However, I don't think it's the best code in terms of readability, and I haven't written the benchmarks yet (there are none for `FastAnd`). I'll make a draft PR later today, but as of now it shouldn't be merged.
EDIT: I just revived my branch for doing the same for `FastAnd` and forgot the original one was for the `ParAnd` function. I can only guess that I wanted to see where the difference comes from. But I think this will matter most when bitmaps have many defined keys, where the `log` part is significant; I'm not sure our benchmarks cover that case.
I think it's related enough to mention it, though. This implementation uses the shotgun search approach to skip irrelevant containers.
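I'm reading "shotgun search" as a galloping/exponential skip over the sorted key array; that's my interpretation, not necessarily what the implementation above does. A minimal sketch of the idea, in the same `package sketch` style as earlier:

```go
package sketch

import "sort"

// advanceUntil returns the smallest index i >= pos with keys[i] >= min, or
// len(keys) if there is none. It gallops forward in doubling steps and then
// binary-searches the final window, so a far skip costs O(log distance)
// rather than O(distance).
func advanceUntil(keys []uint16, pos int, min uint16) int {
	n := len(keys)
	if pos >= n {
		return n
	}
	if keys[pos] >= min {
		return pos
	}
	// Gallop until keys[pos+step] >= min or we run past the end.
	step := 1
	for pos+step < n && keys[pos+step] < min {
		step *= 2
	}
	lo := pos + step/2 + 1 // keys[pos+step/2] < min, so the answer is beyond it
	hi := pos + step
	if hi > n-1 {
		hi = n - 1 // nothing matched while galloping; the answer may be n
	}
	// First index in [lo, hi] with keys[i] >= min (or n if none qualifies).
	return lo + sort.Search(hi-lo+1, func(i int) bool { return keys[lo+i] >= min })
}
```

In the earlier intersection sketch, the linear skip (incrementing `idx[i]` while the key is too small) would become `idx[i] = advanceUntil(bmi.keys, idx[i], key1)`.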
It seems that the current `ParAnd` implementation has `O(N log(B))` complexity, where N is the number of containers and B is the number of bitmaps. It should be possible to ensure that the complexity is `O(N)`.
cc @maciej