RoaringBitmap / CRoaring

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
http://roaringbitmap.org/
Other
1.56k stars 266 forks source link

fastunion vs |= in a loop #417

Open maxdemarzi opened 1 year ago

maxdemarzi commented 1 year ago

I have a std::vector of Roaring64Maps

std::vector<roaring::Roaring64Map> results = // some process

When I use fastunion (probably incorrectly):

    auto sharded_results = new const roaring::Roaring64Map*[results.size()];
    for (size_t i = 0; i < results.size(); ++i) {
        sharded_results[i] = &results.data()[i];
    }

    roaring::Roaring64Map combined = roaring::Roaring64Map::fastunion(results.size(), sharded_results);

It takes almost 5 seconds to combine them. However when I just do:

    roaring::Roaring64Map combined;
      for(const auto& sharded : results) {
        combined |= sharded;
    } 

It takes 2.8 seconds. In this case I only have 4 maps, and the combined carnality ends up just above 3 million. Am I doing something wrong or is this expected?

lemire commented 1 year ago

It is not the case that the approach you take (two-by-two ORs) is necessarily inferior. More importantly, we don't currently have solid benchmarks for fastunion. That is something that we should add.

https://github.com/RoaringBitmap/CRoaring/blob/a644f44898d3c0f27af4406827bbde5f42a06b53/cpp/roaring64map.hh#LL1381