AureumChaos / LEAP

A general purpose Library for Evolutionary Algorithms in Python.
Academic Free License v3.0

Improve interface for steady-state insertion operators #118

Open SigmaX opened 3 years ago

SigmaX commented 3 years ago

We've got nice pluggable operators for generational evolution, but our support for steady-state mechanisms is a little rough around the edges.

Specifically, steady-state algorithms make use of "insertion" operators. These typically

  1. Receive a new individual
  2. Choose an individual in the population for replacement
  3. Optionally compete the new individual against the replacement and keep the winner.

For step (2), it would be really nice if we could use the same selection operators we use for everything else (ex. ops.tournament_selection, ops.random_selection, etc.). But these operators return individuals, and for step (3) we need to know the index of the selected individual so we can replace it with the new one.
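For illustration, steps (1)–(3) can be sketched as a small standalone helper. All names here are hypothetical, not LEAP's actual API; individuals are reduced to a genome plus a fitness value:

```python
import random
from dataclasses import dataclass

@dataclass
class Individual:
    genome: int
    fitness: float

def insert_into_pop(new_ind, population, compete=True, rng=random):
    """Steady-state insertion (steps 1-3): receive a new individual,
    choose an incumbent at random, and keep the winner."""
    i = rng.randrange(len(population))    # step 2: pick a slot to replace
    if not compete or new_ind.fitness >= population[i].fitness:
        population[i] = new_ind           # step 3: the new individual wins
    return population
```

Note that the replacement slot is addressed by index, which is exactly the information the existing selection operators don't expose.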

We've handled this two different ways so far:

It would be nice to have the best of both worlds.

One solution I can think of is having selection operators take an optional argument: a mutable list that they populate with the indices of the selected individuals. That's a little ugly, but it would give us a way to get index information out of them when we need it and ignore it when we don't.
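A rough sketch of that idea, using a simplified stand-in for a LEAP-style selection generator (this `random_selection` is hypothetical, not the real `ops.random_selection`):

```python
import random

def random_selection(population, indices=None, rng=random):
    """Yield individuals selected uniformly at random.  If a mutable
    list is passed as `indices`, record each selected index in it so a
    caller (e.g. steady-state insertion) can later replace that slot."""
    while True:
        i = rng.randrange(len(population))
        if indices is not None:
            indices.append(i)    # side channel: expose the index
        yield population[i]
```

Callers that don't care about indices simply omit the argument, so the generational use of the operator is unchanged.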

SigmaX commented 3 years ago

Thought: Steady-state evolution is an operator.

What makes steady-state different from generational is that it separates the notion of "updating a population" from "running individuals through a reproductive pipeline." You have this reproductive process kind of out to the side, and it reaches in to pull and insert individuals into this persistent population over on the other side.

That's why we currently have separate metaheuristics for generational_ea and steady-state EAs.

But, since steady-state reproduction is so different, why don't we think of it as separate from the population pipeline? Or as its own sub-pipeline? We could just plug a "steady-state operator" into a generational pipeline something like this:

    ea = generational_ea(generations=100, pop_size=pop_size,
                         problem=problem,
                         representation=...,

                         # Operator pipeline
                         pipeline=[
                             ops.steady_state_step(
                                 reproduction_pipeline=[
                                     ops.tournament_selection,
                                     ops.clone,
                                     mutate_binomial(std=1.5, bounds=[problem.bounds]*l,
                                                     expected_num_mutations=1)
                                 ],
                                 contestant_selector=ops.random_selection,
                                 evaluator=...
                             ),
                             # Some visualization probes so we can watch what happens
                             probe.PlotTrajectoryProbe(xlim=problem.bounds, ylim=problem.bounds,
                                                       contours=problem),
                             probe.PopulationPlotProbe(),
                             probe.FitnessStatsCSVProbe(stream=sys.stdout)
                         ]
                    )

This gives us a loosely-coupled view of steady-state EAs as a kind of generational EA that has a nested operator. The operator takes a population, does something to it, and returns the modified population.

One advantage is that it's easy to toss population-level probes onto the end of the pipeline.

The downside is that it's a bit tricky to introduce people to: having a reproduction_pipeline nested inside another pipeline operator may be confusing and steepen the learning curve.
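For concreteness, here is a stripped-down sketch of what such a `steady_state_step` might do internally. Individuals are reduced to `(genome, fitness)` tuples and pipeline operators to plain population-to-population callables; all names are assumptions, not LEAP's implementation:

```python
import random

def steady_state_step(reproduction_pipeline, rng=random):
    """Wrap a reproduction sub-pipeline as a population-to-population
    operator: breed one offspring off to the side, then insert it over
    a randomly chosen incumbent if it is at least as fit."""
    def step(population):
        offspring = population
        for op in reproduction_pipeline:     # run the nested pipeline
            offspring = op(offspring)
        new_ind = offspring[0]               # take the first offspring produced
        i = rng.randrange(len(population))   # contestant chosen at random
        if new_ind[1] >= population[i][1]:   # compare fitness values
            population = list(population)
            population[i] = new_ind
        return population
    return step
```

Because `step` has the same population-in, population-out signature as any other pipeline operator, probes can be placed after it just as in the example above.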

markcoletti commented 3 years ago

How does asynchrony fit into this model?


SigmaX commented 3 years ago

It ought to be straightforward: a parallelized version of the steady-state operator will generate a new individual at each step and send it off to Dask, but insert whatever individual completes next.

Setting the number of processors to zero (or using a non-parallel version of the operator) would restore traditional (non-asynchronous) steady-state behavior.

We could have other variations too: generation gaps of something other than 1 could be handled by a similar operator, or variations like the quasi-generational EA we studied a few years ago.
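A minimal sketch of the asynchronous variant, substituting stdlib concurrent.futures for Dask to keep it self-contained (all names and signatures here are hypothetical): keep a fixed number of evaluations in flight, and whenever one finishes, insert its result and submit a fresh offspring.

```python
import random
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def async_steady_state(init_pop, evaluate, breed, births, workers=2, rng=random):
    """Asynchronous steady-state loop: `workers` evaluations run
    concurrently; each completed individual is inserted over a random
    incumbent (keeping the fitter), and a replacement job is submitted.
    Individuals are (genome, fitness) tuples; `evaluate` maps a genome
    to such a tuple; `breed` produces a genome from the population."""
    population = list(init_pop)
    with ThreadPoolExecutor(max_workers=workers) as ex:
        pending = {ex.submit(evaluate, breed(population, rng))
                   for _ in range(workers)}
        # Roughly `births` insertions: each wait() may return several
        # completed futures at once.
        for _ in range(births):
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                new_ind = fut.result()
                i = rng.randrange(len(population))   # choose an incumbent
                if new_ind[1] >= population[i][1]:
                    population[i] = new_ind
                # Keep the pipeline full: submit the next offspring
                pending.add(ex.submit(evaluate, breed(population, rng)))
    return population
```

Setting `workers=1` (or replacing the executor with direct calls) recovers the traditional synchronous steady-state behavior mentioned above.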

SigmaX commented 3 years ago

I think we've stabilized on a solution here:

This will still leave the asynchronous metaheuristic out, so we'll have a bit of code duplication that isn't unified into a single approach, but that's not a problem for this issue.