esa / pygmo2

A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.
https://esa.github.io/pygmo2/
Mozilla Public License 2.0

Archipelago slices #5

Open CoolRunning opened 4 years ago

CoolRunning commented 4 years ago

Currently, an archipelago object allows access to its islands by indexing:

archi[1]
Island name: Multiprocessing island
Status: idle
...

Looping like for isl in archi is supported as well. However, accessing islands via slicing throws an error:

for isl in archi[0:3]: pass

ArgumentError Traceback (most recent call last)
----> 1 for isl in archi[0:3]:
      2     pass

ArgumentError: Python argument types in
    archipelago.__getitem__(archipelago, slice)
did not match C++ signature:
    __getitem__(pagmo::archipelago {lvalue}, unsigned long)

bluescarni commented 4 years ago

It's a good suggestion. I am not sure whether we should tackle this now, however, or wait for the migration to pybind11: I am a bit wary about adding a feature now that we will need to rewrite at a later stage anyway.

bluescarni commented 4 years ago

@CoolRunning now that we moved to pybind11, I have been thinking about this feature.

What exactly would you expect archi[0:3] to return? A new archipelago containing a copy of the first 3 islands of archi? That would work but it would be misleading I think, since if you use the sliced archi with modifying operations on the islands, then you will be modifying copies of the islands rather than the original islands.

I think in principle we could perhaps return a list of references to the first 3 islands, but I am not sure it is possible to tie the lifetime of the island references in the list to the lifetime of the original archipelago (which is essential in order to avoid runtime crashes if archi is deleted while the sliced island list is still alive).

I'll try to investigate a bit more.

CoolRunning commented 4 years ago

The use case that I see here is indeed modification of the islands. For example, one could think about re-initializing some of them like:

for isl in archi[:10]:
    isl.set_population(pg.population(prob, 100))

Right now this is not possible and I would write something like

for i in range(10):
    archi[i].set_population(pg.population(prob, 100))

which feels less pythonic and becomes a bit bothersome.
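Since iterating over an archipelago is already supported, one interim workaround is `itertools.islice` from the standard library, which avoids the explicit index. This is only a sketch: `islands_between` is a hypothetical helper name, it assumes that iteration yields the islands themselves rather than copies, and `islice` does not accept negative indices. A plain list stands in for the archipelago so the snippet runs without pygmo.

```python
from itertools import islice


def islands_between(archi, start, stop):
    """Lazily yield the items of `archi` at positions start..stop-1.

    Works on any iterable. With a pygmo archipelago it relies on
    `for isl in archi` being supported (assumed here to yield the
    islands themselves, not copies). Negative indices are rejected
    by islice; pass stop=None to run to the end.
    """
    return islice(archi, start, stop)


# Stand-in for an archipelago: any iterable behaves the same way here.
demo = ["isl0", "isl1", "isl2", "isl3"]
selected = list(islands_between(demo, 0, 3))
```

With a real archipelago the loop would read `for isl in islands_between(archi, 0, 10): isl.set_population(pg.population(prob, 100))`.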

With regards to the lifetime: don't we have this problem already?

isl = archi[0]
del archi
isl.get_population() # still works
bluescarni commented 4 years ago

> With regards to the lifetime: don't we have this problem already?
>
> isl = archi[0]
> del archi
> isl.get_population() # still works

There's some special sauce in the bindings code to make sure that this works: when you write isl = archi[n], the lifetime of isl becomes tied to the lifetime of archi, meaning that archi cannot be garbage-collected as long as isl is alive. It's unclear to me if/how this mechanism can be extended to the slice case, but perhaps if we move the implementation of the slicing protocol from C++ to Python we can take advantage of the existing life-extension mechanism without doing anything new on the C++ side.
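A minimal sketch of what a Python-side slicing helper could look like, assuming only that the container supports `len()` and integer indexing. `slice_islands` and `IntIndexOnly` are hypothetical names for illustration, not part of pygmo. Because every element is obtained through `archi[i]`, whatever life-extension mechanism applies to single indexing would apply to each entry of the returned list.

```python
def slice_islands(archi, key):
    """Return [archi[i], ...] for every index i selected by the slice `key`."""
    if not isinstance(key, slice):
        raise TypeError("expected a slice, got " + type(key).__name__)
    # slice.indices() resolves missing and negative bounds against the length.
    return [archi[i] for i in range(*key.indices(len(archi)))]


class IntIndexOnly:
    """Stand-in container that, like the archipelago in the traceback
    above, accepts integer indices only."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        if not isinstance(i, int):
            raise TypeError("integer index required")
        return self._items[i]


demo = IntIndexOnly(["a", "b", "c", "d", "e"])
first_three = slice_islands(demo, slice(0, 3))
last_two = slice_islands(demo, slice(-2, None))
every_other = slice_islands(demo, slice(None, None, 2))
```

The result is a plain list, so it does not itself behave like an archipelago.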

I am still not 100% sure that archi[:10] returning a list of islands rather than an archipelago is fully correct, since perhaps it breaks common assumptions about slicing (i.e., if you slice a numpy array or a list, you still get a numpy array or a list in output).

CoolRunning commented 4 years ago

> I am still not 100% sure that archi[:10] returning a list of islands rather than an archipelago is fully correct, since perhaps it breaks common assumptions about slicing (i.e., if you slice a numpy array or a list, you still get a numpy array or a list in output).

That is a fair point. I am fine with returning an archipelago, though: it is more the looping I would like to simplify. Having said that: sub-archipelagos could be interesting to have as well, but probably a more difficult feature?

In particular for heterogeneous archipelagos it would make sense to sort of define a view on an archipelago and only interact with part of it.

sub_archi1, sub_archi2 = archi[:k], archi[k:]
sub_archi1.evolve(10)
sub_archi2.evolve(100)
...
bluescarni commented 4 years ago

The whole pagmo model on the C++ side is implemented with value semantics, so you cannot really share anything the way you can share/duplicate a reference in the Python sense.

What would perhaps be doable (and strictly on the Python side) is a sub-archipelago class that is a thin wrapper around a list of islands and that exposes an API similar to that of an archipelago. It would be a bit of a pain to maintain, since you would need to duplicate some archipelago code in the sub-archipelago class, and I am not sure the effort would be worth the benefits. Perhaps we could start with the slicing mechanism returning a list of islands and then see how it goes from there.
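To make the thin-wrapper idea concrete, here is a rough sketch under stated assumptions. `IslandGroup` is a hypothetical name, and it forwards only `evolve()` and `wait()` to each island; a real implementation would have to duplicate more of the archipelago API. `FakeIsland` is a stand-in so the snippet runs without pygmo.

```python
class IslandGroup:
    """Hypothetical thin wrapper around the islands selected by a slice."""

    def __init__(self, archi, key):
        # Holding the parent explicitly keeps it alive as long as the group is.
        self._archi = archi
        self._islands = [archi[i] for i in range(*key.indices(len(archi)))]

    def __len__(self):
        return len(self._islands)

    def __iter__(self):
        return iter(self._islands)

    def __getitem__(self, i):
        return self._islands[i]

    def evolve(self, n=1):
        # Forward the call to every island in the group.
        for isl in self._islands:
            isl.evolve(n)

    def wait(self):
        for isl in self._islands:
            isl.wait()


class FakeIsland:
    """Stand-in that only records calls; not a pygmo island."""

    def __init__(self):
        self.evolutions = 0
        self.waited = False

    def evolve(self, n=1):
        self.evolutions += n

    def wait(self):
        self.waited = True


islands = [FakeIsland() for _ in range(5)]  # stands in for an archipelago
head = IslandGroup(islands, slice(None, 2))
tail = IslandGroup(islands, slice(2, None))
head.evolve(10)
tail.evolve(100)
head.wait()
tail.wait()
```

The group stores the objects returned by indexing rather than copies, so calls made through it reach the same islands the parent container hands out.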

CoolRunning commented 4 years ago

Getting a list of islands would already be helpful, because as it stands now, one has to index each island directly or loop over the whole archipelago when only a specific part of it is of interest.

For a migration study (given a specific topology), for example, the following would be useful:

sending_islands = archi[:k]
receiving_islands = archi[k:]

for isl in sending_islands:
    isl.set_s_policy(...)

for isl in receiving_islands:
    isl.set_r_policy(...)

Those setters don't exist yet, though, and probably constitute a separate issue...