Replacing custom CMA-ES with Evosax in CMA-ME.

adaptive-intelligent-robotics / QDax

Accelerated Quality-Diversity

https://qdax.readthedocs.io/en/latest/

MIT License


Replacing custom CMA-ES with Evosax in CMA-ME. #160

Open TemplierPaul opened 11 months ago

TemplierPaul commented 11 months ago

Evosax in CMA-ME:

CMA-ME emitter with Evosax ES instead of custom CMAES, to allow for Sep-CMA-ME or other ES.
- 3 types of CMA-ME emitters: imp (improvement), opt (optimizing) and rnd (random direction)
- ES parameters dict as an emitter __init__ argument
Adapted Evosax cma_criterion for CMAES restart criterion

CMA-ME for policies:

QDaxReshaper class to switch between QDax ANN and Evosax vectors
- Must be created under jax.disable_jit() because of list issues with JAX
CMA-ME emitter for ANN optimization using Evosax
- defaults to SepCMAES for memory footprint reduction with large ANN
- imp, opt and rnd emitters
Pool emitter for CMA-ME emitters with policies
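
To illustrate the kind of conversion a reshaper has to do, here is a minimal sketch built on `jax.flatten_util.ravel_pytree`. It is not the PR's `QDaxReshaper`: the class name `FlatReshaper`, its methods and the toy parameter dict are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree


class FlatReshaper:
    """Hypothetical stand-in for the PR's QDaxReshaper (not its actual code)."""

    def __init__(self, example_params):
        # ravel_pytree returns the flattened vector and a function that
        # rebuilds a pytree with the same structure from such a vector.
        flat, self._unflatten = ravel_pytree(example_params)
        self.genotype_dim = flat.shape[0]

    def flatten(self, params):
        """Pytree of network parameters -> 1D vector for the ES."""
        return ravel_pytree(params)[0]

    def unflatten(self, vector):
        """1D vector from the ES -> pytree of network parameters."""
        return self._unflatten(vector)

    def unflatten_population(self, vectors):
        """(pop_size, genotype_dim) matrix -> pytree with a leading batch axis."""
        return jax.vmap(self._unflatten)(vectors)


# Toy two-layer MLP parameters, standing in for a QDax policy network.
params = {
    "dense_0": {"kernel": jnp.ones((4, 8)), "bias": jnp.zeros((8,))},
    "dense_1": {"kernel": jnp.ones((8, 2)), "bias": jnp.zeros((2,))},
}
reshaper = FlatReshaper(params)
vector = reshaper.flatten(params)
restored = reshaper.unflatten(vector)
population = reshaper.unflatten_population(jnp.zeros((5, reshaper.genotype_dim)))
```

The ES only ever sees vectors of length `genotype_dim`, while the scoring function receives the batched pytree returned by `unflatten_population`.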

2 example notebooks for evosax CMA-ME:

Notebook evosax_cmame.ipynb: arm (vector)
- es_type = "custom" uses the original QDax CMAES for comparison.
Notebook policy_cma_me.ipynb: pointmaze (ANN)

Import without evosax warns but shouldn't fail.
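
One common way to get the "warn but don't fail" behaviour is a guarded import. The sketch below is an assumption about how this could look, not the code in the PR; `optional_import`, `EVOSAX_AVAILABLE` and `require_evosax` are hypothetical names.

```python
import importlib
import warnings


def optional_import(module_name):
    """Return the module if it is installed, otherwise warn and return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        warnings.warn(
            f"{module_name} is not installed; features that depend on it are disabled.",
            stacklevel=2,
        )
        return None


# Importing the package never fails; only evosax-specific code paths do.
evosax = optional_import("evosax")
EVOSAX_AVAILABLE = evosax is not None


def require_evosax():
    """Call at the start of any code path that really needs evosax."""
    if not EVOSAX_AVAILABLE:
        raise ImportError("This emitter needs evosax: pip install evosax")
```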

limbryan commented 11 months ago

Hi @TemplierPaul,

Thanks for the PR. This is a feature we have wanted for a while, so it's great to see some work on this! I have some questions.

Is it possible to make the emitter more general, i.e. an EvosaxEmitter that can easily take in any ES from evosax rather than being specific to CMA-ES? Then the algorithms themselves (like CMA-ME) could simply inherit from such a class. (This is more of a question, as I am not super familiar with evosax.)
To run the PR tests, compile the docs and run the automated pipeline, could you: a. merge develop into your branch, and b. change the Python version in .readthedocs.yaml to 3.9? Hopefully this works.

Thanks!

TemplierPaul commented 11 months ago

Hi @limbryan,

I added evosax as a requirement, and the checks are now passing. I'm not sure if I should directly update other parts of the docs to add it.

When creating the EvosaxCMAMEEmitter, you should be able to use any ES from Evosax through the es_type field (the list is available in evosax.Strategies.keys()), and the hyperparameters for the ES, such as sigma_init, go through the es_params field. The default ES are CMA-ES for the base CMA-ME and Sep-CMA-ES for the policy one. I'm not sure what to remove from the emitter to make it less CMA-ME specific; you can inherit from it and change the _ranking_criteria method to customize it more.
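
The es_type / es_params pattern amounts to a name lookup plus keyword forwarding. The sketch below shows it with a toy registry so that it runs without evosax; `ToyCMAES`, `ToySepCMAES`, `TOY_STRATEGIES`, `build_strategy` and the key strings are all hypothetical stand-ins, and the real emitter's constructor may look different.

```python
from dataclasses import dataclass


@dataclass
class ToyCMAES:
    """Placeholder strategy; stands in for an evosax ES class."""
    num_dims: int
    popsize: int
    sigma_init: float = 1.0


@dataclass
class ToySepCMAES(ToyCMAES):
    """Placeholder for a diagonal-covariance variant."""


# Stand-in for the name -> class mapping referred to as evosax.Strategies.
# The key strings here are assumed, not taken from the PR.
TOY_STRATEGIES = {"CMA_ES": ToyCMAES, "Sep_CMA_ES": ToySepCMAES}


def build_strategy(es_type, num_dims, popsize, es_params=None, registry=TOY_STRATEGIES):
    """Look up the ES class by name and forward the hyperparameter dict."""
    if es_type not in registry:
        raise ValueError(f"Unknown es_type {es_type!r}; choose from {sorted(registry)}")
    return registry[es_type](num_dims=num_dims, popsize=popsize, **(es_params or {}))


strategy = build_strategy(
    "Sep_CMA_ES", num_dims=58, popsize=16, es_params={"sigma_init": 0.1}
)
```

Keeping the hyperparameters in a plain dict means a new ES can be tried by changing two arguments, without touching the emitter code.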

For the algorithm I'd like to make a CMA-MAE version in the future, but that will probably impact the MAPElites and MapElitesRepertoire classes more.

limbryan commented 11 months ago

Thanks @TemplierPaul !

Do you think a more general EvosaxEmitter would be better then? And then the current EvosaxCMAMEEmitter can inherit from this and use all the related cma emitter tools in qdax that you are currently using. This way the EvosaxEmitter can be more generally used, independent and also not reliant on some specific cma tools from qdax. What do you think?

TemplierPaul commented 11 months ago

I would say the evosax ask/tell interface is quite simple already, and it's probably easier/cleaner to re-implement the ES part for new emitters. With ME + ES there is usually the question of whether to emit the whole ES population or only the ES center to the repertoire, so a general emitter would need to support both, which adds complexity. The QDaxReshaper class handles the interface with ANNs, which has been the annoying part in my experience, but maybe you have a more specific case in mind?
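
To make the population-versus-center question concrete, here is a minimal ask/tell loop in JAX. It uses a toy isotropic Gaussian ES, not evosax and not CMA-ES; `ask`, `tell`, `candidates_for_repertoire` and `fitness_fn` are hypothetical names chosen for this sketch.

```python
import jax
import jax.numpy as jnp


def ask(key, mean, sigma, pop_size):
    """Sample a population around the current search mean."""
    noise = jax.random.normal(key, (pop_size, mean.shape[0]))
    return mean + sigma * noise


def tell(population, fitnesses, num_elites):
    """Return the new mean: the average of the best candidates (maximization)."""
    elite_idx = jnp.argsort(-fitnesses)[:num_elites]
    return population[elite_idx].mean(axis=0)


def candidates_for_repertoire(mean, population, emit_population):
    """Choose what the emitter hands to the MAP-Elites repertoire."""
    return population if emit_population else mean[None, :]


def fitness_fn(x):
    # Toy objective with its maximum at the origin.
    return -jnp.sum(x**2, axis=-1)


key = jax.random.PRNGKey(0)
mean = jnp.full((3,), 2.0)
start_fitness = fitness_fn(mean)
for _ in range(30):
    key, subkey = jax.random.split(key)
    population = ask(subkey, mean, sigma=0.3, pop_size=16)
    mean = tell(population, fitness_fn(population), num_elites=4)

# The two emission modes a general emitter would have to support.
whole = candidates_for_repertoire(mean, population, emit_population=True)
center = candidates_for_repertoire(mean, population, emit_population=False)
```

Emitting the whole population gives the repertoire `pop_size` candidates per step, while emitting only the center gives one, so the two modes also differ in how many evaluations feed the archive.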

Lookatator commented 10 months ago

Hi @TemplierPaul, thank you very much for this PR!

I have just changed the target branch from main to develop. From now on, each time you push a new commit, the pipeline with all the tests should be triggered (for now, only the pipeline for the doc is triggered, if I am not mistaken).

Regarding the code architecture, I think @limbryan is right, there should be a way to make the overall code more modular. I think it would be great to avoid having 2 implementations of all the CMA-ME emitters. But such refactoring may be beyond the scope of this PR...