morinim / vita

Vita - Genetic Programming Framework
Mozilla Public License 2.0
35 stars 6 forks source link

Parallelization #21

Closed impulsecorp closed 5 years ago

impulsecorp commented 5 years ago

What is the best way to run Vita multicore?

morinim commented 5 years ago

At the moment Vita doesn't support parallelization.

It seems to me that in a jungle of possibilities there are two interesting (i.e. effective and relatively simple) development opportunities:

  1. the island model parallelization;
  2. the parallelization of the ALPS evolutionary strategy.

Island model parallelization

Subpopulations can be situated at separate processing nodes. When a run begins, each processing node locally creates its own initial random subpopulation and starts the standard evolution process.

Upon completion of batch of generations, a certain number of the individuals in each subpopulation are probabilistically selected for emigration to a designated (e.g. adjacent) node within the topology (e.g. toroidal) of processing nodes.

When the immigrants arrive at their destination, a like number of individuals are selected for deletion from the destination processor. The inter-processor communication requirements of these migrations are low because only a small number of individuals migrate from one subpopulation to another and because the migration occurs only after completion of the time-consuming fitness evaluation.

There are, of course, numerous variations in this overall approach (further details in Parallel Genetic Programming on a Network of Transputers - John R. Koza and David Andre).

The inter-processor communication can be implemented in a distinct program (even written in a different language... Python?). Vita would required only slight modifications.

Parallelization of the ALPS evolutionary strategy

ALPS is a method to reduce the problem of premature convergence. It measure how long the genetic material has been evolving in the population (age) and segregates individuals into different age-layers (see The Age-Layered Population Structure Evolutionary Algorithm - Gregory S. Hornby).

Regularly it introduces new, randomly generated individuals in the youngest layer (so EA is never completely converged) and uses age to restrict competition and breeding (so that younger individuals are able to develop without being dominated by older ones).

As a side effect ALPS gives the possibility to run multiple EAs simultaneously: multiple instances of a search algorithm are run in parallel, with each instance in its own age layer and having its own population of one or more candidate solutions.

This approach requires various modification to the ALPS code and (as every other internal approach) to the cache / hash table that stores the fitness of the population (it shouldn't be too hard).