Open kabergstrom opened 4 years ago
I just tried using rpmalloc, and while performance improves quite a bit, the shipyard allocate benchmark crashes with "memory allocation of 2424 bytes failed".
Perhaps @leudz could check why this is?
I've narrowed it down to:
rayon::ThreadPoolBuilder::new().build().unwrap();
It triggers this assert sometimes.
I suppose the issue is that the benchmark creates a new World in the bench function, and in shipyard's case, creating a new World creates a new thread pool, which immediately spawns threads. rpmalloc also allocates heaps per thread, and I suppose this intense thread-creation pressure is causing an OOM condition, since Windows doesn't overcommit.
@leudz Do you have any opinion on how to solve this?
For non-parallel benchmarks, removing the parallel feature would work.
Maybe using a custom pool could solve the problem, I'm not sure.
Shipyard now uses the global ThreadPool, problem solved =)
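For anyone reading later, here is a minimal std-only sketch of the "share one pool" pattern. It is not shipyard's or rayon's actual code: `Pool`, `global_pool`, and this `World` are hypothetical stand-ins, and the pool here spawns no real threads. It only shows that constructing many worlds initializes the shared pool once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the (hypothetical) pool constructor runs.
static POOLS_CREATED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for an expensive thread pool.
/// A real pool would spawn `threads` worker threads here.
struct Pool {
    threads: usize,
}

impl Pool {
    fn new(threads: usize) -> Self {
        POOLS_CREATED.fetch_add(1, Ordering::Relaxed);
        Pool { threads }
    }
}

/// Returns the process-wide pool, creating it on first use only.
fn global_pool() -> &'static Pool {
    static POOL: OnceLock<Pool> = OnceLock::new();
    POOL.get_or_init(|| Pool::new(4))
}

/// Hypothetical world type that borrows the shared pool
/// instead of building its own.
struct World {
    pool: &'static Pool,
}

impl World {
    fn new() -> Self {
        World { pool: global_pool() }
    }
}

fn main() {
    // Simulates a benchmark that builds a fresh World per iteration.
    let worlds: Vec<World> = (0..100).map(|_| World::new()).collect();
    println!(
        "{} worlds, {} pool(s) created, {} threads each",
        worlds.len(),
        POOLS_CREATED.load(Ordering::Relaxed),
        worlds[0].pool.threads
    );
}
```

With this shape, per-iteration World creation no longer creates threads, so a per-thread-heap allocator is not asked for a new heap on every iteration.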
When running the benchmarks, I got an 80% improvement (10ms to 2ms) when I switched from the platform-provided malloc to rpmalloc on Windows for the serialize_binary benchmark. Additionally, after switching, some other cases had up to a 36% difference in runtime. I think a good avenue to explore for extending the benches would be to measure the number and size of allocations for the test cases, and to warn people about the platform-provided malloc. Maybe you should mandate a cross-platform custom allocator to avoid people benchmarking the wrong thing.
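To make the allocation-counting idea concrete, here is a rough sketch using only `std`: a global allocator that wraps the system one and tallies calls and requested bytes. The names `CountingAlloc` and `measure_allocations` are made up for illustration, and the counters are process-wide, so allocations from other threads during the measured closure are included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and records every allocation request.
struct CountingAlloc;

static ALLOC_COUNT: AtomicUsize = AtomicUsize::new(0);
static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_COUNT.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
    // `realloc` and `alloc_zeroed` are left at their defaults, which
    // go through `alloc` above, so buffer growth is counted as well.
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns (allocation calls, bytes requested) seen meanwhile.
fn measure_allocations<F: FnOnce()>(f: F) -> (usize, usize) {
    let count_before = ALLOC_COUNT.load(Ordering::Relaxed);
    let bytes_before = ALLOC_BYTES.load(Ordering::Relaxed);
    f();
    let count = ALLOC_COUNT.load(Ordering::Relaxed) - count_before;
    let bytes = ALLOC_BYTES.load(Ordering::Relaxed) - bytes_before;
    (count, bytes)
}

fn main() {
    let (count, bytes) = measure_allocations(|| {
        let buf: Vec<u8> = Vec::with_capacity(4096);
        // Keep the optimizer from removing the unused allocation.
        std::hint::black_box(&buf);
    });
    println!("{count} allocation(s), {bytes} bytes requested");
}
```

Reporting these two numbers next to the timings would show which benchmarks are dominated by allocator behavior rather than by the library under test.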
Additionally, I would advise pre-allocating serialization buffers to ensure it's not just a bench of Vec::grow.
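A small sketch of what I mean by pre-allocating, assuming the serialized size is known or can be bounded up front. `Position` and `serialize_into` are hypothetical placeholders, not types from any of the benchmarked libraries.

```rust
/// Hypothetical record standing in for whatever the benchmark serializes.
struct Position {
    x: f32,
    y: f32,
    z: f32,
}

/// Size of one serialized record: three f32 fields of 4 bytes each.
const RECORD_SIZE: usize = 12;

/// Appends each record to `out` as little-endian bytes.
fn serialize_into(records: &[Position], out: &mut Vec<u8>) {
    for r in records {
        out.extend_from_slice(&r.x.to_le_bytes());
        out.extend_from_slice(&r.y.to_le_bytes());
        out.extend_from_slice(&r.z.to_le_bytes());
    }
}

fn main() {
    let records: Vec<Position> = (0..1000)
        .map(|i| Position { x: i as f32, y: 0.0, z: 0.0 })
        .collect();

    // Allocate once, outside the timed loop.
    let mut buf: Vec<u8> = Vec::with_capacity(records.len() * RECORD_SIZE);
    let capacity_before = buf.capacity();

    for _ in 0..3 {
        // `clear` resets the length but keeps the allocation.
        buf.clear();
        serialize_into(&records, &mut buf);
    }

    // The buffer never had to grow during serialization.
    assert_eq!(buf.capacity(), capacity_before);
    println!("wrote {} bytes without reallocating", buf.len());
}
```

In a bench harness, the `with_capacity` call would go in setup and only the `clear` plus serialize step would be timed.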