Open benvanik opened 4 years ago
This could be broken down into two issues:
I definitely agree with 1, and could be quickly fixed by implementing a std::allocator
that wraps the marl::Allocator
. I've created #131 to track this. I'd like to get to the point that we can run all the unit tests and examples without a single heap allocation that wasn't performed by the marl::Allocator
.
As for 2, I don't disagree, but I've been using various profilers and benchmarks for guidance here. The slow replacement of std
containers with those in marl/containers.h
have all been done when they appeared hot in profiles. I'd suspect that you'll find other, more significant optimisation opportunities before you find these to be the bottleneck. If you can show they are a major performance sponge, I'm more than happy to try replacing them.
For 2 I was mainly just going off the TODO in the code and have no empirical results yet to show why anything different would be worth it besides the "doing zero work and not thrashing cache is always cheaper than doing any work" :)
Part of this would probably be adding more test cases to scheduler_bench.cpp so that we can stress those structures a bit more (in terms of having a large number of waiters, etc).
From the TODO here: https://github.com/google/marl/blob/0591f0ef0811eff9e87c22048abc048c0399862d/include/marl/scheduler.h#L283-L287
and this: https://github.com/google/marl/blob/0591f0ef0811eff9e87c22048abc048c0399862d/include/marl/scheduler.h#L436
unordered_map and deque aren't great for memory locality and heap fragmentation, and currently these are not using the marl Allocator so they are coming right from the system allocator. It'd be good to evaluate some alternative data structures (adding to or reusing those already in https://github.com/google/marl/blob/master/include/marl/containers.h) and pooling.