Open saik0 opened 2 years ago
@Kerollmops
Thinking out loud: A major version bump would be a good time to bump rust edition to 2021
Edit: Plus, we've already bumped MSRV to min supported ver for 2021 edition
Here are my benchmark comparisons for v0.8.1 to 49455d44db4d4dd4ab09cdda2ec0385399292ce7
Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz. Hyperthreading and frequency scaling disabled. Base Arch Linux install. Linux 5.16.5-arch1-1
On rust stable (scalar only):
- from_sorted_iter and append are exponentially faster [1].
- deserialize_into is between 60-400% slower with data validation. We are planning on calling this out.
- container::len used to be a field access; now it must branch on store type. This is a tradeoff we made to allow cardinality tracking while flipping bits on bitmaps, which is part of why all the set ops are faster. However, it has had some unintended downstream effects:
  - RoaringBitmap::len is slower by about 60%.
  - RoaringBitmap::serialize_into is about 10-20% slower.
- The difference between the old deserialize_into and the new deserialize_into_unvalidated is within the noise threshold. This does not show up in the data, as previously there were no deserialize_into_unvalidated benchmarks that could be compared automatically. I had to validate this manually.
- is_disjoint being ~15% slower appears to have just been noise. I wasn't able to reproduce this over multiple runs.

[1] Depends on the dataset. Can be as little as zero.
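To make the container::len tradeoff concrete, here is a minimal sketch of what "branching on store type" can look like. The Store enum, its variants, and its fields are hypothetical names chosen for illustration and are not taken from the roaring-rs source; the bitmap variant carries a tracked cardinality that is updated whenever a bit is flipped.

```rust
/// Hypothetical container store: a sorted array of values or a fixed-size bitmap.
enum Store {
    /// Sorted, deduplicated values; the cardinality is the vector length.
    Array(Vec<u16>),
    /// 65,536 bits, with the cardinality tracked alongside and updated on each bit flip.
    Bitmap { bits: Box<[u64; 1024]>, len: u64 },
}

impl Store {
    /// Cardinality requires a branch on the store type instead of a plain field read.
    fn len(&self) -> u64 {
        match self {
            Store::Array(values) => values.len() as u64,
            Store::Bitmap { len, .. } => *len,
        }
    }

    /// Inserts a value and keeps the tracked cardinality in sync.
    /// Returns true if the value was not already present.
    fn insert(&mut self, value: u16) -> bool {
        match self {
            Store::Array(values) => match values.binary_search(&value) {
                Ok(_) => false,
                Err(pos) => {
                    values.insert(pos, value);
                    true
                }
            },
            Store::Bitmap { bits, len } => {
                let word = (value / 64) as usize;
                let mask = 1u64 << (value % 64);
                let added = bits[word] & mask == 0;
                bits[word] |= mask;
                *len += added as u64;
                added
            }
        }
    }
}
```

In this sketch every call to len pays for the match, which is one plausible reason a caller that sums container lengths in a tight loop would get slower even though the set operations themselves benefit from the tracked count.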
Analysis for SIMD coming later, but the data is in the zip if you want to peek
@Kerollmops Time to get rid of union_with() (and others) deprecated in 0.6.7? (April 2021)
> Time to get rid of union_with() (and others) deprecated in 0.6.7? (April 2021)
You are right, I created issue #200 about that. Thank you for the reminder.
> Here are my benchmark comparisons for v0.8.1 to https://github.com/RoaringBitmap/roaring-rs/commit/49455d44db4d4dd4ab09cdda2ec0385399292ce7
I am so impressed by what you've done, and these are only the benchmarks for x86_64 🎉
About the scalar version, you are right: the perf losses aren't very important, as they are under a millisecond! About the SIMD version, that's impressive too! I was sure your work would break the laws of physics 🚄 🚀
Once we release the new version of roaring I will create a GitHub release to talk about the work you have done, featuring Meilisearch just for its fame and the indirect help you bring to this project. We are eagerly awaiting this release of Roaring on the Meilisearch side, just to see those flamegraphs melt 🧊
Meilisearch and other projects that use roaring intensively would also gain speed by using the new multi-ops API. However, it can be released in a future version, as it will not be breaking!
> I am so impressed by what you've done, and these are only the benchmarks for x86_64 🎉
Thanks. I'm curious to see how the array_array SIMD does on other platforms
> We are eagerly awaiting this release of Roaring on the Meilisearch side, just to see those flamegraphs melt
😂
I'm pleased to contribute to OSS again, it's been too long.
> Meilisearch and other projects that use roaring intensively would also gain speed by using the new multi-ops API. However, it can be released in a future version, as it will not be breaking!
That reminds me. I'm going to create a planning ticket for next_version++
@Kerollmops I kicked out #204. It can wait.
Do you think I should try to do a quick fix for #191 since the theme of this release is: 🚀
> Do you think I should try to do a quick fix for #191 since the theme of this release is: 🚀
You can indeed, if you already have a good idea of how you would speed this up!
> You can indeed, if you already have a good idea of how you would speed this up!
Just get rid of the in place operation and force an allocation like we have with union.
@Kerollmops Release notes contain perf numbers from the unmerged xor change. AFAICT master is ready to ship as 0.9 once that is merged.
@Kerollmops Updated release notes for #187. Are we ready to bump the version and release?
Yeah, I think we are ready to do so now. You are right that even if there are no breaking changes we should bump the version, as we drastically impacted the performance of deserialize_from. I will do this either today or tomorrow.
Thank you for reminding me!
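For readers wondering why validation changes deserialization performance so much, here is a toy sketch. It does not use the Roaring serialization format or the roaring-rs API: the wire format (little-endian u32 values that must be strictly increasing) and the function names decode_unchecked and decode_validated are assumptions made up for illustration.

```rust
/// Toy wire format (an assumption for illustration, not the Roaring format):
/// a sequence of little-endian u32 values that must be strictly increasing.
/// The unchecked path trusts the input and ignores any trailing partial chunk.
fn decode_unchecked(bytes: &[u8]) -> Vec<u32> {
    bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// The validated path rejects malformed input instead of building a broken set.
fn decode_validated(bytes: &[u8]) -> Result<Vec<u32>, String> {
    if bytes.len() % 4 != 0 {
        return Err(format!("length {} is not a multiple of 4", bytes.len()));
    }
    let values = decode_unchecked(bytes);
    // This extra pass over the decoded data is where the validation cost comes from.
    if let Some(i) = values.windows(2).position(|w| w[0] >= w[1]) {
        return Err(format!("values not strictly increasing at index {}", i + 1));
    }
    Ok(values)
}
```

The unchecked variant is only appropriate for trusted sources: in this sketch it happily returns unsorted or duplicated values, which would violate the invariants the rest of a set implementation relies on.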
Hum... @saik0,
I have a small issue when I try to publish the crate on crates.io: it seems that we can't publish a crate that depends on a git repository unless the dependency also specifies a version that has been published to crates.io. However, core_simd doesn't seem to be available on crates.io.
$ cargo publish --dry-run
Updating crates.io index
error: all dependencies must have a version specified when publishing.
dependency `core_simd` does not specify a version
Note: The published dependency will use the version from crates.io,
the `git` specification will be removed from the dependency declaration.
Here's what I think is still left before the next release

Release notes

Breaking changes
- Removed the deprecated union_with (and others). Use the corresponding operators, such as |=.
- deserialize_from validates inputs. In some cases it can be 4x slower. For workloads that are heavy in deserialization from trusted sources, migrate to deserialize_unchecked_from.

Performance optimizations
- from_sorted_iter and append are exponentially faster. They should be preferred over collect and extend whenever adding monotonically increasing integers to the set, as they are about 2-2.5x faster.

New features
- rank: Returns the number of integers that are <= value. rank(u64::MAX) == len()
- select: Returns the nth integer in the set, or None if n >= len()
- union_len, intersection_len, ... and so on: Compute the cardinality of a set operation without materializing it. Usually about twice as fast as materializing the bitmap.
- DoubleEndedIterator
- ExactSizeIterator (on 64 bit targets only)

Other
- Perf numbers are for pairwise operations on collections of bitmaps from real datasets. Pairwise as in: B1 ∩ B2, B2 ∩ B3, ... BN-1 ∩ BN
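The semantics of rank, select, and the *_len operations described in the notes above can be sketched on plain sorted slices. This is not the roaring-rs implementation and does not use its types: the free functions below are stand-ins written for illustration, and they assume each slice is sorted and free of duplicates.

```rust
/// Number of values in the sorted, deduplicated slice that are <= value.
fn rank(set: &[u32], value: u32) -> usize {
    set.partition_point(|&x| x <= value)
}

/// The n-th smallest value (0-based), or None when n >= set.len().
fn select(set: &[u32], n: usize) -> Option<u32> {
    set.get(n).copied()
}

/// Cardinality of the intersection, computed by a merge walk
/// without building the intersection itself.
fn intersection_len(a: &[u32], b: &[u32]) -> usize {
    use std::cmp::Ordering;
    let (mut i, mut j, mut count) = (0, 0, 0);
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                count += 1;
                i += 1;
                j += 1;
            }
        }
    }
    count
}

/// Inclusion-exclusion gives the union cardinality from the intersection cardinality.
fn union_len(a: &[u32], b: &[u32]) -> usize {
    a.len() + b.len() - intersection_len(a, b)
}
```

Counting during the walk avoids allocating and filling a result set, which is the general reason a cardinality-only operation can beat materializing the set and then asking for its length.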