quaquel / EMAworkbench

workbench for performing exploratory modeling and analysis
BSD 3-Clause "New" or "Revised" License

Prototype of MPIEvaluator for multi-node workloads #292

Closed EwoutH closed 1 year ago

EwoutH commented 1 year ago

Very raw prototype of a multi-node evaluator that uses the MPIPoolExecutor from mpi4py.futures.

Ignore all the scripts, the interesting stuff is the new MPIEvaluator class in the evaluators.py file.

So far it has been tested with pure Python models on multiple nodes of the DelftBlue HPC system.
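As a rough illustration of the idea (not the actual MPIEvaluator code), the sketch below maps a set of experiments over a pool through the standard `concurrent.futures.Executor` interface, which `MPIPoolExecutor` also implements. The names `toy_model` and `evaluate_experiments` are hypothetical, and a `ThreadPoolExecutor` stands in for the MPI pool so the snippet runs without an MPI installation.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def toy_model(experiment):
    """Hypothetical pure-Python model: maps one experiment (a dict) to outcomes."""
    return {"y": experiment["a"] * experiment["x"] + experiment["b"]}


def evaluate_experiments(executor: Executor, model, experiments):
    """Run every experiment through `model` on the given executor.

    Executor.map returns results in input order, so they line up
    one-to-one with the experiments.
    """
    return list(executor.map(model, experiments))


experiments = [{"a": 2, "x": x, "b": 1} for x in range(5)]

# Stand-in pool so the sketch runs anywhere. On a cluster, the assumption is
# that one would swap in the MPI pool instead:
#   from mpi4py.futures import MPIPoolExecutor
#   with MPIPoolExecutor() as pool: ...
# and launch the script with something like:
#   mpiexec -n 4 python -m mpi4py.futures script.py
with ThreadPoolExecutor(max_workers=2) as pool:
    results = evaluate_experiments(pool, toy_model, experiments)

print(results)
```

With an MPI pool, the model function and the experiments are pickled and sent to worker processes, so both need to be picklable (for example, a module-level function and plain dicts).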

For todo and discussion, see: https://github.com/quaquel/EMAworkbench/discussions/266

coveralls commented 1 year ago

coverage: 80.667% (-0.2%) from 80.893% when pulling 89817aef3b8e59b5fc7090ca0465a0bc7789765d on multi-node-development into b76b487afc5fbaaa34e5a055269657ff15fa908e on master.

EwoutH commented 1 year ago

@quaquel Finally fixed the propagation of logging levels in 89817ae. Could you review the changes to evaluators.py and ema_logging.py? Please check the set_root_logger_levels argument in particular, and whether you find it necessary at all (or whether it can be the default behaviour).
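For context, one common way to propagate a logging level to pool workers is an initializer that runs once in each worker before any task. The sketch below shows that general pattern only; it is not the code from 89817ae. The logger name `"EMA"`, the function `init_worker_logging`, and the `also_set_root` flag are hypothetical stand-ins, and the real set_root_logger_levels argument may behave differently. A `ThreadPoolExecutor` is used so the snippet runs without MPI; my understanding is that `MPIPoolExecutor` accepts `initializer` and `initargs` in the same way, but treat that as an assumption to verify.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

LOGGER_NAME = "EMA"  # assumed logger name, for illustration only


def init_worker_logging(level, also_set_root=False):
    """Set the named logger's level in a worker; optionally the root logger's too."""
    logging.getLogger(LOGGER_NAME).setLevel(level)
    if also_set_root:
        logging.getLogger().setLevel(level)


def report_level():
    """Task that reports the effective level seen inside the worker."""
    return logging.getLogger(LOGGER_NAME).getEffectiveLevel()


# The initializer runs in each worker before it picks up its first task.
with ThreadPoolExecutor(
    max_workers=1,
    initializer=init_worker_logging,
    initargs=(logging.DEBUG,),
) as pool:
    worker_level = pool.submit(report_level).result()

print(logging.getLevelName(worker_level))
```

Making the root-logger change opt-in keeps the workers from silently changing the verbosity of third-party libraries, which is one argument for not making it the default.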

From my perspective, the functional changes are done. Running on multiple nodes and logging are implemented successfully, but support for non-Python models is not (it might still work, but I didn't manage to get it running).

My current plan is to spend next week writing some documentation, examples, and maybe a short tutorial, and doing some quick performance measurements. Then I will clean up all commits on a new branch and open a PR.

EwoutH commented 1 year ago

Closing this PR, since it's succeeded by #299.

Future discussion about the new MPIEvaluator is welcome at #311!