gugarosa / opytimizer

🐦 Opytimizer is a Python library consisting of meta-heuristic optimization algorithms.
https://opytimizer.readthedocs.io
Apache License 2.0

GA roulette selection method [REG] #14

Closed Martsks closed 3 years ago

Martsks commented 3 years ago

Pre-checkings

Description

First of all, thanks for the great framework you are providing, it's exactly what I was looking for! I just started using the genetic algorithm when I stumbled across the roulette selection method. As I understand it, the method _roulette_selection implemented in opytimizer.optimizers.evolutionary.ga.py is designed for a maximization problem:

# Calculates the total fitness
total_fitness = np.sum(fitness)

# Calculates the probability of each fitness
probs = [fit / total_fitness for fit in fitness]

# Performs the selection process
selected = d.generate_choice_distribution(n_agents, probs, n_individuals)

If I use my minimization problem with this optimizer, a lower (i.e. better) fitness value would correspond to a lower probability of being selected as a parent, correct? So the method should instead be something like:


total_fitness = np.sum(1-np.array(fitness))

probs = [(1-fit) / total_fitness for fit in fitness]

Tell me if I've overlooked something. Kind regards, Martin

gugarosa commented 3 years ago

Hello Martin! I hope everything is going well with you.

Firstly, I deeply appreciate all the nice comments regarding the Opytimizer package, and I am truly glad to know that it is being helpful to the community.

Secondly, your observation is completely correct. Since the library is designed for minimization problems, we need to perform some type of inversion of the fitness values so that the best parents are selected with the highest probabilities. However, the trick you mentioned would not guarantee that the probabilities sum to 1, nor that they are all non-negative (unfortunately, d.generate_choice_distribution() requires both conditions), since fitness is an unconstrained value that ranges from 0 to "infinity".
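To illustrate with some hypothetical fitness values: as soon as any fitness exceeds 1, the `1 - fit` inversion produces a negative "probability":

```python
import numpy as np

# Minimization problem: 0.5 is the best individual
fitness = [0.5, 2.0, 5.0]

# Proposed inversion: total = 0.5 + (-1.0) + (-4.0) = -4.5
total_fitness = np.sum(1 - np.array(fitness))

# probs = [-1/9, 2/9, 8/9] -> contains a negative entry,
# so it is not a valid probability distribution
probs = [(1 - fit) / total_fitness for fit in fitness]
```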

A common approach to solve that problem (that I found at "Genetic algorithms in search, optimization, and machine learning" by D. Goldberg) is to invert the fitness function based on the maximum fitness of each generation. For example:

These changes turned out to be amazing and improved convergence by roughly 20~30% with little additional burden. Additionally, I have committed them, so they should be ready to go. Just please make sure that you install with pip install . as the changes have not been deployed to PyPI (pip repository) yet.

Thank you again for finding this bug, and sorry about it!

Best regards, Gustavo.