Closed jsboige closed 3 years ago
According to the previous remark, I created pull #83 to merge existing operators underneath a common inherited base class and allow for more abstraction.
I believe a good reference would be mealpy, which implements in Python many modern meta-heuristics enhancing evolutionary search techniques. Most of them seem to improve the general stochastic beam search of GAs with more sophisticated custom search guidance for segregated individuals and/or phases/generations, allowing for optimising the exploration/exploitation compromise, which seems to be the main issue at stake here.
With the addition of dedicated new interface(s), maybe together with an Island wrapper above in some cases, running several GAs with population migrations, this new entry point should allow to implement most modern variations.
Thanks for the report and pull request.
The pull request #83 was merged.
Following investigations about issue #47, I'm a bit puzzled with the evolution workflow and the resulting behavior. It seems to me that on most tasks, Elite selection and Elitist reinsertion are constantly outperforming any other alternative operators, that accordingly, preserving genetic diversity is hard with usually a quick collapse to sub-optimal solutions, and that more generally despite an abundance of parameters or extensive code coverage, parameter tuning is pretty hard.
I would be willing to investigate contributing extensions dealing specifically with those issues, such as Island GAs variants or other meta-heuristics, but I believe the basic workflow has existing issues that might need addressing first.
Let me try to follow a generation cycle starting at Crossover stage, which is an interesting entry point for improvements IMHO. Returning from IOperatorsStrategy.Cross we get a number of children <= MinSize, depending on child/parent ratio (1 for most crossovers) and crossover probability (<=1). Pair-wise parents matching is performed without specific attention, simply following parents order returned by selection.
After crossover, offspring are mutated on an individual basis, without impacting the population globally.
Reinsertion operator doesn't seem right: Elitist and Uniform reinsertion will fill up the new population to MinSize respectively with best parents and random offspring copies. I wasn't able to get Uniform reinsertion to work at all, with fitness consistantly not improving. Pure reinsertion will only work if crossover yields the same number of offspring as parents (certain crossover are not permitted + crossover probability must be equal to 1 + MinSize must be an even number). Just as for uniform reinsertion, with the appropriate population size compatible settings, I wasn't able to get any result at all. FitnessBasedReinsertion also has issues:
That leaves us with apparently only Elitist reinsertion usable, that is, the necessity to keep best parents in subsequent generations.
Either way, a typical generation currently ends with a population of unordered children filled up with best parents or additional children if needed, to a total number of MinSize, then sorted twice by Fitness value in GA.EvaluateFitness and again in Generation.End, which also trims to MaxSize (both seem unneeded)
At the Selection stage of next generation, the selection operator might do differents things according to the type used: In all cases, it will yield a MinSize number of parents.
The whole workflow suggests to me a revamp might be needed to make sense of the currently available options:
Evolution of population size might need a review. Population Maxsize is enforced in several places though effectively unused: only Minsize seems to be of any use for now; accordingly FitnessBased reinsertion or any collapse capable reinsertion also seem unusable. Maybe Crossover operator strategy could be updated to allow for modulating offspring population size, independently of Crossover probability and crossover operators.
More attention could be put into population order / sorting by fitness. When using Elite reinsertion and Elite Selection, as far as I can see, population will be sorted 4 times by fitness in GA.EvaluateFitness, Generation.End, EliteSelection.PerformSelectChromosomes and ElitistReinsertion.PerformSelectChromosomes. Then, according to the current implementation, crossover pairwise matching, which will greatly influence search dynamics, relies on post selection order, in what seems a non consistent way (see StochasticUniversalSampling above). Finally about the sorting overhead itself, that does not seem too much, but it can also be discussed. Linq OrderByDescending, is mostly used in several places, and uses Quicksearch internally, which includes an overhead with already sorted sets, and even more so with near sorted sets. Interestingly, Insertion sort performs much better with near sorted sets, which might be of consideration with evolution depending on operators, and even though Quicksearch is interesting when only a subset of the first sorted elements is needed, the corresponding improvements documented in the following article were not accounted for in the latest linq internals, and a full quicksort is still triggered every time. Accordingly there are 2 issues here:
The very idea of FitnessBased reinsertion challanges the current operations sequence. Either reinsertion should not be concerned with offspring fitnesses at all, leaving that to the selection operators, or the current workflow should be updated to allow for fitness evaluation before reinsertion.