Closed markaren closed 2 years ago
It uses a threadpool rather than std::for_each so that compilers older than gcc9 will work
This is an important thing to address if this is something we want to move forward with. Do we have to support old compilers that are not C++17 feature complete?
If the target branch gets the ok, we should also make it possible to specify the count in the XML config.
Edit: This was meant as a comment in #680
Adding the functionality to set the number of worker threads for SSP was trivial, but not so much for the OSP alternative, as it does not provide algorithm-specific configuration options. This is related to #404
I see your point, but I don't think this can be solved separately. Should not be a blocker for this PR.
What's the reason for this PR to remain in draft mode? Any remaining issues that must be resolved?
None other than significant vetting.
And yeah, the concurrent file locking test needs to be fixed (currently commented out).
utility_concurrency_unittest is still commented out. Should be included and fixed before merging.
Test included and passing now
As fibers are removed, I think cosim::utility::shared_mutex can be replaced with std::shared_mutex. Don't want to add more to this PR, and can push that as a separate PR.
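To make the suggested replacement concrete, here is a minimal sketch of the usual std::shared_mutex pattern: exclusive lock for writers, shared lock for readers. The class variable_cache and its members are hypothetical and only for illustration; nothing here is taken from libcosim, and the interface of cosim::utility::shared_mutex may differ from what this assumes.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical class for illustration only; not part of libcosim.
class variable_cache
{
public:
    // Writers take an exclusive lock, so no reader or other writer is active.
    void set(const std::string& name, double value)
    {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        values_[name] = value;
    }

    // Readers take a shared lock, so several threads may read concurrently.
    double get_or(const std::string& name, double fallback) const
    {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        const auto it = values_.find(name);
        return it == values_.end() ? fallback : it->second;
    }

private:
    mutable std::shared_mutex mutex_; // mutable so const readers can lock it
    std::map<std::string, double> values_;
};
```

std::shared_mutex blocks the calling OS thread, which is presumably fine once fibers are gone, and it requires C++17.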
@markaren, if you are ok with the latest changes to thread_pool.hpp, are we then ready to merge?
@kyllingstad, @eidekrist, share your opinions if you disagree :)
How are the observed differences in usage/speed/accuracy on your side? All good? We can probably set the default number of threads to std::thread::hardware_concurrency()-1 in fixed_step_algorithm as suggested?
Yes. I've added an unsigned int max_threads_ = std::thread::hardware_concurrency() - 1 variable to fixed_step_algorithm. With what is now a blocking-only strategy, it may no longer be necessary, but I'd like to extend this in the future to include a spinlock as well. Blocking and resuming threads has a non-negligible overhead when done at a high rate, such as in a simulation with a very small time step.
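One detail worth guarding against with this default: std::thread::hardware_concurrency() is allowed to return 0 when the value cannot be determined, and subtracting 1 from an unsigned 0 wraps around to a very large number. A minimal sketch of a clamped default follows. The helper name default_worker_count is hypothetical and not part of this PR.

```cpp
#include <thread>

// Hypothetical helper, not part of libcosim: leave one hardware thread free,
// but never return fewer than one worker.
inline unsigned int default_worker_count(
    unsigned int hardware_threads = std::thread::hardware_concurrency())
{
    // hardware_concurrency() may return 0 if the value is not computable;
    // 0u - 1 would wrap around to UINT_MAX, so clamp 0 and 1 to one worker.
    if (hardware_threads <= 1) {
        return 1;
    }
    return hardware_threads - 1;
}
```

With this, the member could be initialised as max_threads_ = default_worker_count(), assuming a single worker is an acceptable fallback on single-core machines.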
We seem to be seeing ~15-20% improvements in simulation speed with this PR over the fiber implementation, at least with the example projects like dp-ship.
One more thing: You may want to consider running clang-format on everything before merging. I see there are some includes that are out of alphabetical order after the async_slave --> slave change, and possibly other things. If it's not fixed now, it's going to show up in someone else's PR later.
This PR replaces #671. It uses a threadpool rather than std::for_each so that compilers older than gcc9 will work. Note that slave state is unimplemented in this PR. It will eventually be included once I have figured out how best to add it back in after removing async_slave. Additionally, the concurrency test has been commented out. The test is about file locking.
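For readers unfamiliar with the approach, the following is a minimal sketch of a blocking thread pool of the kind described, using only std::thread, std::mutex and std::condition_variable, so it does not depend on the C++17 parallel algorithms. It is not the thread_pool.hpp from this PR. The names simple_thread_pool, submit, wait_for_tasks_to_finish and step_all are assumptions made for illustration.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Illustrative only: NOT the thread_pool.hpp from this PR.
// Tasks must not throw; an escaping exception would call std::terminate.
class simple_thread_pool
{
public:
    explicit simple_thread_pool(std::size_t num_threads)
    {
        for (std::size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] { worker_loop(); });
        }
    }

    ~simple_thread_pool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        work_available_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    simple_thread_pool(const simple_thread_pool&) = delete;
    simple_thread_pool& operator=(const simple_thread_pool&) = delete;

    // Queues a task and wakes one sleeping worker.
    void submit(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        work_available_.notify_one();
    }

    // Blocks the caller until every submitted task has finished.
    void wait_for_tasks_to_finish()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        all_done_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void worker_loop()
    {
        for (;;) {
            std::function<void()> task;
            {
                // Workers sleep here (blocking strategy, no spinning).
                std::unique_lock<std::mutex> lock(mutex_);
                work_available_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return; // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --pending_;
            }
            all_done_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable work_available_;
    std::condition_variable all_done_;
    std::queue<std::function<void()>> tasks_;
    std::size_t pending_ = 0; // tasks queued or currently running
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage sketch: "step" a number of hypothetical slaves in parallel.
// Each task touches only its own element, so no extra locking is needed.
inline std::vector<int> step_all(std::size_t num_slaves, std::size_t num_threads)
{
    std::vector<int> step_count(num_slaves, 0);
    simple_thread_pool pool(num_threads);
    for (std::size_t i = 0; i < num_slaves; ++i) {
        pool.submit([&step_count, i] { ++step_count[i]; });
    }
    pool.wait_for_tasks_to_finish();
    return step_count;
}
```

Sleeping on a condition variable keeps idle workers off the CPU, at the price of a wake-up cost on every batch of tasks.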