Replace fibers with threadpool - Githubissues

open-simulation-platform / libcosim

OSP C++ co-simulation library

https://open-simulation-platform.github.io/libcosim

Mozilla Public License 2.0

61 stars 10 forks source link

Replace fibers with threadpool #674

Closed markaren closed 2 years ago

markaren commented 3 years ago

This PR replaces #671. It uses a threadpool rather than std::for_each so that compilers older than gcc9 will work.

Note that slave state is unimplemented in this PR. It will eventually be included when I have figured out how to best make it back in after removing async_slave. Additionally the concurrency test has been commented out. The test is about file locking.

markaren commented 2 years ago

It uses a threadpool rather than std::for_each so that compilers older than gcc9 will work

This is an important thing to address if this is something we want to move forward with. Do we have to support old compilers that are not C++17 feature complete?

markaren commented 2 years ago

If the target branch gets the ok, we should also make it possible to specify the count in the XML config.

Edit: This was meant as an comment in #680

markaren commented 2 years ago

Adding the functionality to set the number of worker threads for SSP was trivial, but not so much for the OSP alternative as it does not provide algorithm spesific configuration options. This is related to #404

ljamt commented 2 years ago

Adding the functionality to set the number of worker threads for SSP was trivial, but not so much for the OSP alternative as it does not provide algorithm spesific configuration options. This is related to #404

I see your point, but I don't think this can be solved separately. Should not be a blocker for this PR.

ljamt commented 2 years ago

What's the reason for this PR to remain in draft mode? Any remaining issues that must be resolved?

markaren commented 2 years ago

What's the reason for this PR to remain in draft mode? Any remaining issues that must be resolved?

Not other than significant vetting.

And yeah, the concurrent file locking test needs to be fixed (currently commented out).

ljamt commented 2 years ago

utility_concurrency_unittest is still commented out. Should be included and fixed before merging.

ljamt commented 2 years ago

utility_concurrency_unittest is still commented out. Should be included and fixed before merging.

Test included and passing now

ljamt commented 2 years ago

As fibers are removed, I think cosim::utility::shared_mutex can be replaced with std::shared_mutex. Don't want to add more to this PR, and can push that as a separate PR.

ljamt commented 2 years ago

@markaren, if you are ok with the latest changes to thread_pool.hpp are we then ready to merge? @kyllingstad, @eidekrist, share your opinions if you disagree :)

markaren commented 2 years ago

How are the observed differences in usage/speed/accuracy on your side? All good? We can probably set the the default number of threads to std::thread::hardware_concurrency()-1 in fixed_step_algorithm as suggested?

restenb commented 2 years ago

How are the observed differences in usage/speed/accuracy on your side? All good? We can probably set the the default number of threads to std::thread::hardware_concurrency()-1 in fixed_step_algorithm as suggested?

Yes. I've added a unsigned int max_threads_ = std::thread::hardware_concurrency() - 1 variable to fixed_step_algorithm. With what is now a blocking only strategy, it may no longer be necessary, but I'd like to extend this in the future to include a spinlock as well. Blocking & resuming threads has a non-negligible overhead cost when done at a higher rate, such as with a very small timestep simulation.

We seem to be seeing ~15-20% improvements in simulation speed with this PR over the fiber implementation, at least with the example projects like dp-ship.

kyllingstad commented 2 years ago

One more thing: You may want to consider running clang-format on everything before merging. I see there are some includes that are out of alphabetic order after the async_slave --> slave change, and possibly other things. If it's not fixed now, it's going to show up in someone else's PR later.