rodrigo-arenas / Sklearn-genetic-opt

ML hyperparameters tuning and features selection, using evolutionary algorithms.
https://sklearn-genetic-opt.readthedocs.io
MIT License
286 stars 73 forks

Question about selection and crossover #130

Closed mario-sanz closed 1 year ago

mario-sanz commented 1 year ago

Hello,

I have been trying to understand the selection and crossover methods for GASearchCV, but I still have some doubts. I am using the default algorithm (eaMuPlusLambda), but in the implementation it appears that both mu and lambda_ are set to None.

If I am not wrong, these parameters establish the following:

If both of them are set to None, then I don't understand what percentage of a new generation consists of parents from the previous one and what percentage consists of mutated children of crossed parents.

I believe that the reproduction process is the following one:

  1. Selection: With the chosen selection method, select individuals that will produce next generation.
  2. Crossover: Apply crossover to some of the selected individuals according to crossover probability.
  3. Mutation: Apply mutation to the resulting population according to mutation probability.

My question is: is the next generation composed only of (possibly mutated) children of crossed parents, or does it also include parents that were not crossed? In the second case, what is the proportion of children to parents?

Thanks a lot in advance!

Mario

rodrigo-arenas commented 1 year ago

Hi @mario-sanz. The default value in the stand-alone algorithms is None, but inside the GASearchCV class those values are overridden with mu=self.population_size and lambda_=2 * self.population_size, so they're never really None. This also answers the question of how many individuals are created in the next generation (more on this at the end).

You're mostly right about the steps in the reproduction process. In the next generation you'll have crossed individuals and mutated individuals (from the crossed ones). Additionally, you could have the best individuals from the previous generation; this happens if the elitism parameter is set to True (which is the default) or because of the selected algorithm variation.

Keep in mind that, depending on the algorithm, the new population might be generated from both the offspring and the current population (eaMuPlusLambda) or only from the offspring (eaMuCommaLambda). So if elitism is set to False, individuals from the current generation won't always be passed to the next one.
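To make the difference between the two strategies concrete, here is a minimal pure-Python sketch (not the library's actual code). It assumes a hypothetical `select_best` helper that simply keeps the fittest individuals, as a stand-in for whatever selection operator is configured, and uses toy individuals with random fitness values. The sizes follow the mu=population_size and lambda_=2 * population_size rule mentioned above.

```python
import random


def select_best(candidates, k):
    """Hypothetical stand-in for the selection operator: keep the k fittest."""
    return sorted(candidates, key=lambda ind: ind["fitness"], reverse=True)[:k]


def next_generation(population, offspring, mu, strategy):
    """Pick the mu individuals that form the next generation."""
    if strategy == "plus":      # eaMuPlusLambda: parents compete with offspring
        pool = population + offspring
    elif strategy == "comma":   # eaMuCommaLambda: only offspring are eligible
        pool = offspring
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return select_best(pool, mu)


rng = random.Random(0)
population_size = 4
mu, lambda_ = population_size, 2 * population_size

# Toy individuals: only an origin tag and a random fitness (illustrative data).
population = [{"origin": "parent", "fitness": rng.random()} for _ in range(mu)]
offspring = [{"origin": "child", "fitness": rng.random()} for _ in range(lambda_)]

plus_survivors = next_generation(population, offspring, mu, "plus")
comma_survivors = next_generation(population, offspring, mu, "comma")

print("plus: ", [ind["origin"] for ind in plus_survivors])
print("comma:", [ind["origin"] for ind in comma_survivors])
```

With the "plus" pool, how many parents survive is not a fixed percentage: it depends on how their fitness compares with that of the lambda_ offspring. With the "comma" pool, no parent can survive through selection alone.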

Also, keep in mind that the algorithm you select changes how many individuals are created and some of the criteria for creating them. One example is the "variation method": varAnd (used in eaSimple) or varOr (used in eaMuPlusLambda and eaMuCommaLambda).
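As a rough illustration of how the two variation methods differ, here is a simplified pure-Python sketch modeled on my reading of the DEAP documentation, not the real implementation. The `mate` and `mutate` operators and the tuple-of-bits individuals are hypothetical toys. The key point, as I understand it, is that in the varAnd style an individual can be both crossed and mutated, while in the varOr style each child comes from exactly one of crossover, mutation, or plain copying.

```python
import random


def mate(a, b):
    """Toy one-point crossover at the midpoint (hypothetical operator)."""
    cut = len(a) // 2
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(ind):
    """Toy mutation: flip the first bit (hypothetical operator)."""
    return (1 - ind[0],) + ind[1:]


def var_and(parents, cxpb, mutpb, rng):
    """varAnd style: crossover AND mutation may both hit the same individual."""
    offspring = list(parents)
    for i in range(1, len(offspring), 2):
        if rng.random() < cxpb:
            offspring[i - 1], offspring[i] = mate(offspring[i - 1], offspring[i])
    for i in range(len(offspring)):
        if rng.random() < mutpb:
            offspring[i] = mutate(offspring[i])
    return offspring


def var_or(parents, lambda_, cxpb, mutpb, rng):
    """varOr style: each child comes from crossover OR mutation OR copying."""
    if cxpb + mutpb > 1.0:
        raise ValueError("cxpb + mutpb must not exceed 1.0")
    offspring = []
    for _ in range(lambda_):
        choice = rng.random()
        if choice < cxpb:                 # crossover: keep one of the two children
            a, b = rng.sample(parents, 2)
            child, _sibling = mate(a, b)
        elif choice < cxpb + mutpb:       # mutation of a single parent
            child = mutate(rng.choice(parents))
        else:                             # reproduction: unchanged copy
            child = rng.choice(parents)
        offspring.append(child)
    return offspring


rng = random.Random(42)
parents = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1), (1, 0, 1, 0)]

and_children = var_and(parents, cxpb=0.5, mutpb=0.2, rng=rng)
or_children = var_or(parents, lambda_=8, cxpb=0.5, mutpb=0.2, rng=rng)

print(len(and_children), "children from the varAnd-style step")
print(len(or_children), "children from the varOr-style step")
```

Note that the varAnd-style step returns as many individuals as it receives, whereas the varOr-style step produces lambda_ children regardless of the number of parents.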

You can read more about those here: varAnd, varOr

I hope this helps

rodrigo-arenas commented 1 year ago

I'm closing this issue for now. Let me know if more questions arise.

mario-sanz commented 1 year ago

Hi @rodrigo-arenas

Sorry for the late response. Yes, your answer was really helpful. You solved all my doubts. Thank you very much!