dalbabur opened 8 months ago
Hi @dalbabur

When you run an evolution on an `mp_island`, the algorithm state (which includes the `mybfe` object set via `a.set_bfe(mybfe)`) needs to be serialised and transmitted to the remote process spawned by `mp_island`. The algorithm will be deserialised in the remote process, and evolution can then start.

My guess would be that during (de)serialisation of the `mybfe` object, the information about the custom setup of the ipyparallel view is not kept.
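The transmission mechanism can be sketched in plain Python (this is an illustrative mock, not pygmo's internals): only the pickled state of the object travels to the worker, so any setup that lives outside that state must be redone on the remote side.

```python
import pickle

class MyBfe:
    """UDBFE-like stand-in; `profile` is illustrative, not pygmo's API."""

    def __init__(self, profile):
        self.profile = profile  # instance state: survives pickling

# Serialised in the parent process, deserialised in the worker.
payload = pickle.dumps(MyBfe("my-profile"))
remote_bfe = pickle.loads(payload)
print(remote_bfe.profile)  # -> my-profile
```

Anything configured per-process rather than stored on the instance, such as a view set up once at startup, is not part of `payload` and therefore never reaches the worker.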
Is there any specific reason why you want to mix process-based serialisation with ipyparallel?
Oh I see, that makes sense. Then, theoretically, it would be possible by modifying `__setstate__` and `__getstate__`, no?
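A minimal sketch of that idea (the `make_view` helper is hypothetical, standing in for constructing an ipyparallel view with custom client settings; nothing here is pygmo's actual API): the live view is dropped from the pickled state and rebuilt from the stored settings after unpickling.

```python
import pickle

def make_view(profile):
    # Hypothetical stand-in for building an ipyparallel view
    # with custom client settings.
    return {"profile": profile}

class ViewBfe:
    """UDBFE-like sketch that owns its view and survives pickling."""

    def __init__(self, profile="my-profile"):
        self.profile = profile
        self.view = make_view(profile)

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("view")  # a live view would not be picklable
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Rebuild the view in the new process from the stored settings.
        self.view = make_view(self.profile)

remote = pickle.loads(pickle.dumps(ViewBfe()))
print(remote.view)  # -> {'profile': 'my-profile'}
```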
The original reason I was interested in mixing process-based serialisation with ipyparallel was to offload part of the work to a local machine, while doing the more expensive fitness evaluations in a remote cluster. I've tried two ipyparallel clusters, and that works fine, but was wondering about other options, as increasing the number of ipyparallel nodes really slows things down.
@dalbabur
After taking a look at the code, I realised my explanation was partly incorrect. What is actually happening is that pygmo manages a global instance of the `ipyparallel_view`, which is implicitly created on first use of any ipyparallel-related functionality. So there is no `ipyparallel_view` stored in the `mybfe` object, and nothing gets serialised and transmitted to the remote process.

Instead, the remote process has its own global `ipyparallel_view` object, which is created on demand if and when it is used. The remote `ipyparallel_view` is created with default settings, so your custom options `client_kwargs={'profile':'my-profile'}` are not being used.
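The per-process global can be sketched like this (a pure-Python mock, not pygmo's implementation; the fresh remote process is simulated here by resetting the module-level singleton):

```python
_view = None  # module-level singleton, one per process

def get_view(**client_kwargs):
    # Created on demand with whatever kwargs the *first* caller in
    # this process supplies; kwargs on later calls are ignored.
    global _view
    if _view is None:
        _view = {"client_kwargs": client_kwargs}
    return _view

# Parent process: custom setup on first use.
parent = get_view(profile="my-profile")

# The remote process starts with its own, unset global (simulated):
_view = None
remote = get_view()  # created on demand with default settings
print(parent, remote)
```

Because the remote singleton is built from scratch with no arguments, the parent's custom `client_kwargs` never take effect there.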
I see two possible solutions, both of which I think involve some modifications to the pygmo source code:

1) either we give the user the possibility to execute some custom setup code whenever a new `mp_island` is created (so that you could run `init_view(client_kwargs={'profile':'my-profile'})` in this custom code snippet), or

2) we change `ipyparallel_bfe()` so that each instance contains its own view, and make sure that it is properly (de)serialised when it is pickled.

Personally, I would prefer number 2). I initially wrote `ipyparallel_bfe` and `ipyparallel_island` to use a global `ipyparallel_view` because I was wary of potential performance issues when using multiple views, but in hindsight that was probably a premature optimisation.
To be honest, we never got much user feedback on the `ipyparallel`-related functionality, so it was never tweaked or improved after the initial implementation.

If you have some familiarity with `ipyparallel` and would like to contribute to pygmo, PRs would be welcome :)
The relevant code would be here:
https://github.com/esa/pygmo2/blob/master/pygmo/_py_bfes.py#L321
I don't think it would be too much work.
**Describe the bug**
I'm passing a UDBFE with an initialized ipyparallel view to the island constructor. When initializing the population, it correctly uses the UDBFE. However, when evolving the island, it tries to create a new BFE with the default ipyparallel cluster!
**To Reproduce**

1. Define UDI and UDBFE
2. Start islands. Evaluation does happen with the correct UDBFE.
3. Error when evolving the islands: