esa / pygmo2

A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.
https://esa.github.io/pygmo2/
Mozilla Public License 2.0

Dask Integration [FEATURE] #101

Open franciscblubaugh opened 2 years ago

franciscblubaugh commented 2 years ago

I recently came across this library in a technical talk. I frequently use the Dask parallel processing engine to scale my work across multiple machines. Are there any plans to extend the multiprocessing support to leverage something like Dask or MPI for cluster-based optimization?

flemmel commented 1 year ago

Hi @franciscblubaugh,

We are successfully using Dask with Pygmo2 in our project Pyxel (https://gitlab.com/esa/pyxel and https://esa.gitlab.io/pyxel/). It works well on a single computer and on a grid of computers (84 cores).

We have developed our own user-defined BFE (Batch Fitness Evaluator) and user-defined Island using Dask.

Our project has been open source (MIT License) since January. You can find the user-defined BFE and island here: https://gitlab.com/esa/pyxel/-/blob/master/pyxel/calibration/user_defined.py
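For readers who want the shape of the idea without opening that file, here is a minimal sketch of a user-defined batch fitness evaluator. It is not the Pyxel code. It assumes only the documented pygmo contract that a UDBFE is called as `__call__(prob, dvs)` with all decision vectors stored contiguously in `dvs`, and that it returns the fitness vectors stored contiguously in the same order. The names `ExecutorBFE` and `SphereStandIn` are hypothetical, and a standard-library thread pool stands in for a Dask `Client`, which also offers `submit()` returning futures with `.result()`.

```python
from concurrent.futures import ThreadPoolExecutor


class ExecutorBFE:
    """Hypothetical batch fitness evaluator (a sketch, not the Pyxel code)."""

    def __init__(self, executor):
        # Any object with submit(fn, *args) -> future is accepted here.
        self._executor = executor

    def __call__(self, prob, dvs):
        nx = prob.get_nx()  # length of one decision vector
        # Split the flat batch into one decision vector per task.
        chunks = [list(dvs[i:i + nx]) for i in range(0, len(dvs), nx)]
        futures = [self._executor.submit(prob.fitness, dv) for dv in chunks]
        # Concatenate the fitness vectors in the original input order.
        flat = []
        for future in futures:
            flat.extend(future.result())
        return flat

    def get_name(self):
        return "Executor-backed BFE (sketch)"


class SphereStandIn:
    """Stand-in exposing the two problem methods the evaluator relies on."""

    def get_nx(self):
        return 2

    def fitness(self, x):
        return [sum(v * v for v in x)]


with ThreadPoolExecutor(max_workers=2) as pool:
    bfe = ExecutorBFE(pool)
    fitnesses = bfe(SphereStandIn(), [1.0, 2.0, 3.0, 4.0])
```

The sketch uses plain lists so that it runs without pygmo installed; with real pygmo objects the batch and the result would be NumPy arrays. pygmo copies user-defined objects when wrapping them, so a real evaluator would probably need to obtain its client lazily instead of storing an executor as an attribute.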

This code could/should be integrated in Pygmo2.

What do the Pygmo contributors think?

bluescarni commented 1 year ago

> What do Pygmo contributor think ?

We would certainly welcome PRs in this sense :)

flemmel commented 1 year ago

Nice!

I will create a pull request!

IvoSteiner commented 7 months ago

I am successfully using Pygmo2 on my local PC (single machine) and am very satisfied with the parallel optimization performance. As part of my master's thesis, I intend to run parallel optimizations with Pygmo2 on the university's HPC cluster (multiple machines). I have explored the Dask integration / extension for Pygmo2 as implemented in Pyxel. I have a general understanding of the process, but I am still having difficulties implementing it in my own code.

@flemmel and @bluescarni

@bluescarni I am a beginner when it comes to parallelizing Python code. Based on the description of Pygmo2's capabilities, I assumed that the library already runs natively on HPC clusters (multiple machines).

Thanks for your help!

bluescarni commented 7 months ago

> * An _official_ Dask integration in Pygmo2 would be highly desirable. Is it still planned?

No concrete plans at the moment.

> * Do you have basic Dask Pygmo2 _integration / extension_ examples other than Pyxel.py itself?

Dask integration would mean implementing a user-defined island that distributes the evolutions via Dask. We have several user-defined islands implemented in pygmo already:

https://github.com/esa/pygmo2/blob/master/pygmo/_py_islands.py

See also the island documentation for information on the API that a user-defined island needs to implement:

https://esa.github.io/pygmo2/island.html
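As a rough illustration of that API, the sketch below shows the one method a user-defined island must provide: `run_evolve(algo, pop)`, which returns a tuple of the algorithm and the evolved population. It is not an official pygmo island. The names `ExecutorIsland`, `_evolve` and `HalvingAlgo` are hypothetical, the stand-in algorithm only mimics the `evolve(pop)` call, and a standard-library thread pool takes the place of a Dask `Client` (which also has `submit()` returning futures with `.result()`).

```python
from concurrent.futures import ThreadPoolExecutor

# Module-level pool, so the island object itself holds no state
# and stays trivially copyable.
_POOL = ThreadPoolExecutor(max_workers=2)


def _evolve(algo, pop):
    # Runs on the worker: evolve the population, then hand back both objects.
    new_pop = algo.evolve(pop)
    return algo, new_pop


class ExecutorIsland:
    """Hypothetical user-defined island (a sketch, not part of pygmo)."""

    def run_evolve(self, algo, pop):
        # Submit the evolution and block until the worker returns (algo, pop).
        future = _POOL.submit(_evolve, algo, pop)
        return future.result()

    def get_name(self):
        return "Executor-backed island (sketch)"


class HalvingAlgo:
    """Stand-in for an algorithm: evolve() halves every value."""

    def evolve(self, pop):
        return [x / 2 for x in pop]


algo, pop = ExecutorIsland().run_evolve(HalvingAlgo(), [8.0, 4.0])
_POOL.shutdown()
```

With a real cluster, `algo` and `pop` would have to be serialized to reach a remote worker, so the problem and algorithm would need to be picklable. That is an assumption about deployment, not something this sketch exercises.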

> * Are there approaches with less overhead than Dask to execute Pygmo2 on HPCs?
> * How do you work with Pygmo2 on an HPC? Any simple examples are appreciated.

We have an ipyparallel island which can be used on HPC setups:

https://esa.github.io/pygmo2/islands.html#pygmo.ipyparallel_island

However, we don't have much experience or user feedback regarding HPC deployments.

IvoSteiner commented 7 months ago

Thank you for the prompt response. I will take a closer look at the concepts you mentioned. I will reach out again if I have any new insights regarding the HPC deployment. However, unfortunately, it no longer has the highest priority in my thesis.

erl987 commented 4 hours ago

I want to upvote a Dask integration as a built-in island. It does not seem too hard to try the approaches mentioned in this issue.

However, the competing frameworks jMetalPy and pymoo provide Dask support out of the box, which is a clear advantage for distributed computing.