ENH: Parallel mode for monte-carlo simulations

brunosorban commented 3 months ago

This pull request adds the option to run simulations in parallel to the MonteCarlo class. The feature uses a context manager named MonteCarloManager to centralize all workers and shared objects, ensuring proper termination of the sub-processes.

A second feature is the ability to export (close to) all simulation inputs and outputs to an .h5 file. The file can be viewed with HDFView (or similar software). Since this is a less conventional file format, a method to read it and a structure to post-process multiple simulations were also added under rocketpy/stochastic/post_processing. A cache handles the data manipulation and returns a 3D numpy array with all simulations, whose shape is (simulation_index, time_index, column). The column axis is reserved for vector data, where x, y and z, for example, may be available under the same dataset. For example, cache.read_inputs('motors/thrust_source') returns both time and thrust.
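To make the array layout concrete, here is a minimal sketch using plain numpy. It does not call the cache itself: the array is built by hand, and the sizes, the thrust values and the column order (time first, thrust second) are assumptions for illustration only.

```python
import numpy as np

# Hypothetical sizes: 4 simulations, 5 time samples each.
n_sims, n_time = 4, 5

# Hand-built stand-in for what a call such as
# cache.read_inputs('motors/thrust_source') is described as returning.
time = np.linspace(0.0, 2.0, n_time)
rng = np.random.default_rng(0)
thrust = 1500.0 + rng.normal(0.0, 20.0, size=(n_sims, n_time))

# Stack along the last axis: (simulation_index, time_index, column).
# Column 0 is assumed to hold time, column 1 thrust.
data = np.stack([np.broadcast_to(time, (n_sims, n_time)), thrust], axis=-1)

# All samples of a single simulation: shape (time_index, column).
first_sim = data[0]

# Mean thrust curve across simulations: shape (time_index,).
mean_thrust = data[:, :, 1].mean(axis=0)

print(data.shape, first_sim.shape, mean_thrust.shape)
```

Keeping the simulation index on the first axis means that statistics across runs reduce over axis 0, while a single run is recovered with one integer index.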

Pull request type

[x] Code changes (bugfix, features)

Checklist

[ ] Tests for the changes have been added (if needed)
[x] Docs have been reviewed and added / updated
[ ] Lint (black rocketpy/ tests/) has passed locally
[ ] All tests (pytest tests -m slow --runslow) have passed locally
[ ] CHANGELOG.md has been updated (if relevant)

Current behavior

Currently, Monte Carlo simulations must run serially and all outputs are written to a txt file.

New behavior

Monte Carlo simulations may now be executed in parallel, and all outputs may be exported to a txt or an h5 file, saving either selected key data or everything.

Breaking change

[ ] Yes
[x] No

Additional information

None

brunosorban commented 3 months ago

Benchmark results, obtained on a machine with 6 cores (12 threads).

[Figure: workers_performance]

Gui-FernandesBR commented 3 months ago

Amazing feature! As the results show, the MonteCarlo class has great potential for parallelization.

The only blocking issue I see with this PR is the serialization code. It still does not support all of RocketPy's features and requires a lot of maintenance and updates on our end.

Do you see any other option for performing the serialization of inputs?

@phmbressan we should make all the classes JSON serializable; it's an open issue at #522. In the meantime, maybe we could still use the _encoders module to serialize inputs.

I agree with you that implementing flight class serialization within this PR may create maintenance issues for us. The simplest solution would be to delete the flightv1_serializer (and similar) functions.

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 30.51643% with 148 lines in your changes missing coverage. Please review.

Project coverage is 75.23%. Comparing base (aa0673a) to head (e40a871). Report is 12 commits behind head on develop.

| Files | Patch % | Lines |
|---|---|---|
| rocketpy/simulation/monte_carlo.py | 25.75% | 147 Missing :warning: |
| rocketpy/tools.py | 66.66% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           develop     #619      +/-   ##
===========================================
- Coverage    75.75%   75.23%   -0.53%
===========================================
  Files           81       85       +4
  Lines         9820    10203     +383
===========================================
+ Hits          7439     7676     +237
- Misses        2381     2527     +146
```


phmbressan commented 1 month ago

The monte_carlo_class_usage notebook currently does not work with parallel. I did not have time to look into it, so I did not review the parallel part of the code.

I know your review was just temporary, but could you be a bit more specific about what is not working on the parallel side? It might be an OS-related issue, which we should of course fix, but here things were working fine.

MateusStano commented 1 month ago

I know your review was just temporary, but could you be a bit more specific about what is not working on the parallel side? It might be an OS-related issue, which we should of course fix, but here things were working fine.

Open the monte_carlo_class_usage.ipynb and run all cells.

The parameter parallel is set to True, so the simulation runs in parallel.

After the simulation is done, nothing is saved to the .inputs.txt or .outputs.txt files.

If you set parallel to False instead, the results are saved correctly.

phmbressan commented 1 month ago

I have pushed a fix for the file-writing issue when running on Windows (more precisely, when processes are started in spawn mode). I tested it on a Windows machine and it ran correctly, but I invite reviewers to test it on other OS configurations as well.

Issues solved by this PR:

[X] MonteCarlo simulations have a parallel mode;
[X] Both the simulation execution and data saving are executed in parallel (producer - consumer);
[X] There are performance gains on large simulations;
[X] Serial simulations can be executed in the same fashion, and the outputs of both modes are compatible.
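
As a rough illustration of the producer-consumer split mentioned above, the sketch below uses only the standard library. It is not the code in this PR: simulate_one, format_result, run_parallel and the output layout are hypothetical stand-ins, and a single writer process is an assumed design.

```python
import multiprocessing as mp
import os
import random
import tempfile

SENTINEL = None  # each producer sends this once when it has no more work


def simulate_one(seed):
    """Hypothetical stand-in for one Monte Carlo flight simulation."""
    rng = random.Random(seed)
    return {"index": seed, "apogee": 3000.0 + rng.gauss(0.0, 50.0)}


def format_result(result):
    """Serialize one result as a single text line."""
    return f"{result['index']},{result['apogee']:.3f}\n"


def producer(seeds, queue):
    """Run the assigned simulations and hand the results to the writer."""
    for seed in seeds:
        queue.put(simulate_one(seed))
    queue.put(SENTINEL)


def consumer(queue, path, n_producers):
    """Write results as they arrive until every producer has finished."""
    finished = 0
    with open(path, "w") as handle:
        while finished < n_producers:
            item = queue.get()
            if item is SENTINEL:
                finished += 1
            else:
                handle.write(format_result(item))


def run_parallel(n_sims, n_workers, path):
    """Start one writer and n_workers producers, then wait for all of them."""
    queue = mp.Queue()
    chunks = [list(range(i, n_sims, n_workers)) for i in range(n_workers)]
    writer = mp.Process(target=consumer, args=(queue, path, n_workers))
    workers = [mp.Process(target=producer, args=(c, queue)) for c in chunks]
    for proc in [writer, *workers]:
        proc.start()
    for proc in [*workers, writer]:
        proc.join()


if __name__ == "__main__":
    out_path = os.path.join(tempfile.gettempdir(), "outputs_sketch.txt")
    run_parallel(n_sims=20, n_workers=4, path=out_path)
```

Having only the consumer touch the file avoids interleaved writes from several processes, and the `if __name__ == "__main__":` guard is what keeps the script safe when child processes re-import the module under the spawn start method.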

Points of Improvement:

[ ] Soft interrupts of parallel simulations (e.g. an exception or Ctrl-C) are only effective on Linux. Spawned processes (Windows) currently hard stop.
[ ] On Windows, the Jupyter notebook will not show the status update prints (running the simulations in a terminal is fine). This seems to be an OS-level standard output change that is not easily solved.

Some of these points could become issues in the repository. I am stating them here for proper PR documentation.

Future Considerations:

From Python 3.14 onward, fork is no longer the default start method on Linux (the default becomes forkserver, while Windows and macOS already default to spawn). We could keep fork as RocketPy's start method on Linux if the change hurts performance too much;
The Python GIL should become optional and may eventually be removed some years from now (PEP 703). This could bring performance benefits, since threads are generally faster to start than processes.
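
A minimal sketch of how the start method could be pinned explicitly, independent of the interpreter default. The helper name pick_start_method and its prefer_fork flag are hypothetical, not part of RocketPy; only the multiprocessing calls are standard library.

```python
import multiprocessing as mp
import sys


def pick_start_method(prefer_fork=True):
    """Return "fork" on Linux when it is available, otherwise "spawn"."""
    available = mp.get_all_start_methods()
    if prefer_fork and sys.platform.startswith("linux") and "fork" in available:
        return "fork"
    return "spawn"


# A context object scopes the choice to this code instead of changing the
# global default with multiprocessing.set_start_method().
ctx = mp.get_context(pick_start_method())
print(ctx.get_start_method())
```

Processes and queues created through ctx (for example ctx.Process and ctx.Queue) then use the selected method without affecting other libraries in the same interpreter.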

Gui-FernandesBR commented 1 month ago

@phmbressan I like the way this PR was refactored. Many thanks for your effort.

Please fix the pylint errors and solve all the open conversations in this PR so we can approve and merge it onto develop!

Optionally, try to rebase the PR to get the latest commits from develop.

Gui-FernandesBR commented 3 weeks ago

Converted to draft until you solve the remaining issues, especially the random number generation problem, @phmbressan
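
The thread does not spell out the random number generation problem. One common pitfall in parallel Monte Carlo runs is workers sharing or duplicating RNG state, and the sketch below shows one generic way to give each worker an independent, reproducible stream with numpy. It is an assumption about the kind of fix that might apply, not the change made in this PR; make_worker_rngs is a hypothetical helper.

```python
import numpy as np


def make_worker_rngs(root_seed, n_workers):
    """Derive one statistically independent generator per worker."""
    children = np.random.SeedSequence(root_seed).spawn(n_workers)
    return [np.random.default_rng(child) for child in children]


rngs = make_worker_rngs(root_seed=42, n_workers=4)

# Each worker draws from its own stream, so no two workers repeat samples.
first_draws = [rng.normal() for rng in rngs]
print(first_draws)
```

Because the child seeds are derived deterministically from the root seed, rerunning with the same root seed and worker count reproduces the same streams.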
