Closed michaelsproul closed 1 year ago
Some more benchmarks, this time on x86_64 Linux: https://gist.github.com/michaelsproul/4462d18ce7075479b03517349ffd06e2
Unfortunately it looks like we can't just Pareto-improve by switching to `triomphe`. There are substantial regressions in some of the x86_64 benchmarks, especially for `RedBlackTreeMap`. I made a spreadsheet of results here: https://docs.google.com/spreadsheets/d/1WsuOhA2DoKUCfoEE-1OAFPjMMUgo4jFpiZx8RVbrpH0/edit
One strategy would be to switch the backend only where the gains are universal, i.e. for `List`, `Queue` and `HashTrieMap`. `Vector` could go either way, and `RedBlackTreeMap` is a hard no IMO unless we gate by target-arch (or OS). I'm not entirely sure whether it's the ARM CPU or the OS making more of a difference, as I know the macOS allocator is substantially better with fragmentation than GNU malloc. In production I tend to use `jemalloc`, and will try to re-run the Linux benchmarks with `jemalloc` to see if this makes a difference.
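For illustration, gating the default backend by target-arch could look roughly like the sketch below. The marker types and alias names (`StdArcKind`, `TriompheArcKind`, `ListDefaultKind`, `RedBlackTreeMapDefaultKind`) are hypothetical stand-ins, not the real rpds/archery types, and picking `aarch64` as the "fast" target is an assumption based on the ARM results mentioned above.

```rust
// Hypothetical marker types standing in for the two pointer kinds discussed
// above; these are NOT the real archery/rpds types.
pub struct StdArcKind; // stand-in for the std::sync::Arc backend
pub struct TriompheArcKind; // stand-in for the triomphe backend

pub trait PtrKind {
    const NAME: &'static str;
}

impl PtrKind for StdArcKind {
    const NAME: &'static str = "std::sync::Arc";
}

impl PtrKind for TriompheArcKind {
    const NAME: &'static str = "triomphe::Arc";
}

// A structure where the gains were universal uses triomphe on every target.
pub type ListDefaultKind = TriompheArcKind;

// RedBlackTreeMap only picks triomphe on aarch64 (assumed target choice);
// every other architecture keeps the standard library backend.
#[cfg(target_arch = "aarch64")]
pub type RedBlackTreeMapDefaultKind = TriompheArcKind;
#[cfg(not(target_arch = "aarch64"))]
pub type RedBlackTreeMapDefaultKind = StdArcKind;

/// Returns the name of the backend selected for a given pointer kind.
pub fn backend_name<K: PtrKind>() -> &'static str {
    K::NAME
}

fn main() {
    println!("List: {}", backend_name::<ListDefaultKind>());
    println!(
        "RedBlackTreeMap: {}",
        backend_name::<RedBlackTreeMapDefaultKind>()
    );
}
```

Gating by OS instead would swap `target_arch = "aarch64"` for something like `target_os = "macos"`; which of the two actually matters is exactly the open question above.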
My benchmarking process was:
git checkout master
cargo bench -- sync # baseline
git checkout triomphe
cargo bench -- sync # comparison
If you have any different hardware @orium it might also be good to see the results from that.
@michaelsproul Sorry, I just renamed the `master` branch to `main`. Can you open the PR again after rebasing on `main` please?
Thanks, reopened here: https://github.com/orium/rpds/pull/88
Change the default `Arc` backend to `triomphe`, which results in speedups of up to 40% (per https://github.com/orium/rpds/pull/85).
The default backend is changed to `ArcTK`, while retaining the ability for the user to swap out to `ArcK` if they prefer.
This PR does not currently include the ability to completely compile-out `triomphe`. If that were desired, we would have to expose a feature called `triomphe` which controls the backend, and causes `std::sync::Arc` to be used when the feature is disabled.
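A minimal sketch of what such a feature gate might look like, assuming a Cargo feature named `triomphe` and a hypothetical alias `SharedPtr` (neither exists in this PR). When the feature is off, the `triomphe` path is compiled out entirely and the alias resolves to `std::sync::Arc`.

```rust
// With the (hypothetical) `triomphe` feature enabled, the alias points at
// triomphe's Arc; this line is compiled out when the feature is disabled.
#[cfg(feature = "triomphe")]
pub type SharedPtr<T> = triomphe::Arc<T>;

// Without the feature, fall back to the standard library's Arc.
#[cfg(not(feature = "triomphe"))]
pub type SharedPtr<T> = std::sync::Arc<T>;

/// Creates a shared value and a second handle to it, using only operations
/// that both backends provide (`new`, `clone`, and deref).
pub fn share(value: u32) -> (SharedPtr<u32>, SharedPtr<u32>) {
    let a = SharedPtr::new(value);
    let b = a.clone();
    (a, b)
}

fn main() {
    let (a, b) = share(7);
    println!("{} {}", *a, *b);
}
```

Code written against the alias stays the same under either backend as long as it sticks to the shared subset of the two APIs.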