Closed markaren closed 2 years ago
Might need to fallback on a thread pool for older gcc versions (if we still want to support old versions)..
And the docker image needs libtbb-dev
This looks very interesting, and I'll give it a proper review as soon as I have the time (not today, unfortunately).
I do think the claim of a performance boost needs to be a bit better substantiated. What kinds of performance testing has been done? How many different system configurations; how many slaves in each; were they performing lightweight or heavyweight computations, or a combination thereof; etc.? (And yes, I understand that performance is just a part of the motivation here.)
The claims about performance needs more vetting for sure. I've only done minor non-scientifc tests so far, which do not actually compare the numeric results. So there could be bugs. However, the Knuckleboom case for instance runs 4x faster with this PR. Setting simulation to 5 sec for the test case and timing simulate_until
yields 43s for fibers and 10s for std:.for_each.
However, looking at the benchmarks from https://ntnuopen.ntnu.no/ntnu-xmlui/bitstream/handle/11250/2728069/1-s2.0-S1569190X20301726-main%2b%25281%2529.pdf?sequence=2&isAllowed=y, we know that libcosim is not as fast as it could be.
https://github.com/open-simulation-platform/libcosim/tree/parallel-pool uses a threadpool, so it should work with older compilers.
Supporting std::for_each
also means dropping gcc8
. Is that still fine?
Also I don't know how to update the docker image to include libtbb-dev
required by gcc9
<execution>
requires TBB (Intel Thread Building Blocks) 2018 on linux. Ubuntu 18.04 ships with 2017 version. Need to update to Ubuntu 20.04. I assume this is a showstopper for this PR, and we should use a thread pool ?
<execution>
requires TBB (Intel Thread Building Blocks) 2018 on linux. Ubuntu 18.04 ships with 2017 version. Need to update to Ubuntu 20.04. I assume this is a showstopper for this PR, and we should use a thread pool ?
Should be possible to change our workflows to make it compile, but maybe we shouldn't require this for any user to build it locally. The threadpool solution is just as good in my mind :)
This PR aims to remove Boost::fiber, refactoring the code to become sequential. Parallelism becomes an implementation detail of the algorithm used. For this PR, the fixed-step algorithm uses the parallel std::for_each algorithm. A bool flag could be added to make it not use paralellism.
The overall aims is to make the code less verbose, easier to understand and provide a perfomance boost.
There are some issues with file locking that I have not been able to resolve.
Let the discussion begin.