slds-lmu / qdo_yahpo

A Collection of Quality Diversity Optimization Problems Derived from Hyperparameter Optimization of Machine Learning Models
BSD 3-Clause "New" or "Revised" License

Some comments #1

Open dietmarwo opened 1 year ago

dietmarwo commented 1 year ago

Congratulations, qdo_yahpo and yahpo_gym are excellent projects. I am very impressed with the level of parallelization achieved.

Parallelization can be implemented at the level of fitness evaluation, as you do, or inside the optimizer itself, an approach implemented in https://github.com/dietmarwo/fast-cma-es (fcmaes).
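To make the two options concrete, here is a minimal toy sketch. It is not the qdo_yahpo or fcmaes code: `fitness`, `random_search`, `parallel_evaluation` and `parallel_optimizers` are hypothetical names, the optimizer is plain random search, and threads stand in for the worker processes a real setup would probably use.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(x):
    """Toy objective (sphere function); stands in for an expensive benchmark call."""
    return sum(v * v for v in x)


def random_search(seed, evals, dim=3, batch_evaluator=None, batch_size=8):
    """Toy optimizer (assumption: plain random search). Returns the best fitness found."""
    rng = random.Random(seed)
    best = float("inf")
    done = 0
    while done < evals:
        n = min(batch_size, evals - done)
        batch = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
        if batch_evaluator is None:
            values = [fitness(x) for x in batch]  # sequential evaluation
        else:
            values = batch_evaluator(batch)  # evaluation delegated to the caller
        best = min(best, min(values))
        done += n
    return best


def parallel_evaluation(evals, workers=4):
    """Option A: a single optimizer; each batch of solutions is evaluated in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return random_search(
            0, evals, batch_evaluator=lambda batch: list(pool.map(fitness, batch))
        )


def parallel_optimizers(evals, workers=4):
    """Option B: several optimizer instances run in parallel; each evaluates sequentially."""
    per_worker = evals // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda seed: random_search(seed, per_worker), range(workers))
        return min(results)
```

In option A the fitness function has to accept a batch of solutions. In option B a plain single-solution fitness function is enough, because the parallelism sits one level higher.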

I used/adapted your code to apply fcmaes to your benchmark problems. See https://github.com/dietmarwo/fast-cma-es/blob/master/examples/yahpo.py. fcmaes handles some things differently, so maybe it is interesting for you to compare the approaches.

The results cannot be compared directly because different tessellations are used (Grid/Voronoi), but my impression is that with both 1E5 and 1E6 evaluations per run the results are better than all alternatives tested in qdo_yahpo. The main reason seems to be that CR-FM-NES (see https://arxiv.org/abs/2201.11422) works better than CMA-ES as an emitter for the tasks benchmarked by qdo_yahpo. For other tasks the advantage of CR-FM-NES can be even larger; see my test results in https://github.com/google/evojax/pull/52. The SBX/mutation operators used for MAP-Elites may be another reason.
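To show why archives built on different tessellations are hard to compare, here is a small sketch of how a behavior descriptor is assigned to a niche in each case. The helper names `grid_niche` and `voronoi_niche` and the example centroids are made up for illustration; neither library's archive code is reproduced here.

```python
import math


def grid_niche(descriptor, lower, upper, bins):
    """Grid tessellation: map each descriptor dimension to an equal-width bin index."""
    cell = []
    for d, lo, hi, n in zip(descriptor, lower, upper, bins):
        frac = (d - lo) / (hi - lo)
        # Clamp so that values on or beyond the bounds fall into the outermost bins.
        cell.append(min(n - 1, max(0, int(frac * n))))
    return tuple(cell)


def voronoi_niche(descriptor, centroids):
    """Voronoi tessellation: the niche is the index of the nearest centroid."""
    return min(range(len(centroids)), key=lambda i: math.dist(descriptor, centroids[i]))


# Hypothetical 2-D descriptor space [0, 1] x [0, 1].
descriptor = (0.25, 0.9)
cell = grid_niche(descriptor, lower=(0.0, 0.0), upper=(1.0, 1.0), bins=(4, 4))
niche = voronoi_niche(descriptor, centroids=[(0.1, 0.1), (0.9, 0.9), (0.1, 0.9)])
```

The same descriptor ends up in niches of different shape and size, so quantities such as the number of occupied niches or the summed fitness over niches depend on the tessellation.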

In terms of wall time the improvement over qdo_yahpo is only about a factor of 2 (tested on a 16-core AMD 5950X), but fcmaes does not require a fitness function that evaluates multiple solutions in parallel, since parallelization is handled inside the optimizer. Your parallelized fitness evaluator already scales very well; otherwise the difference would be much larger.

sumny commented 1 year ago

Thanks for the comments! I am happy to see that you find the benchmarks useful! CR-FM-NES looks interesting - nice to see that one can improve upon the baselines we tested on the benchmarks! I'll leave the issue open - because someone interested in the benchmarks might also find your comments very helpful :)