icaros-usc / pyribs

A bare-bones Python library for quality diversity optimization.
https://pyribs.org
MIT License
208 stars 32 forks source link

[FEATURE REQUEST] Heterogeneous Archive #328

Closed LemonPi closed 1 year ago

LemonPi commented 1 year ago

It would be interesting to explore heterogeneous archives where each measure/behavior can be sampled from different spaces (that do not scale uniformly) potentially with constraints. For example, in pytorch-mppi I developed an autotuner for its hyperparameters that's compatible with the popular ray tune library. The tuner accepts optimizers ranging from CMA-ES, HyperOpt, and fmfn/BayesianOptimization. I'd like to apply a QD optimizer here as well, but AFAIK the archive accepted by pyribs are only homogeneous in the sense that they scale linearly along each dimension. For many hyperparameters and in general, some measures may be better searched on the logscale. For example in MPPI, we have the default search spaces:

                 sigma_search_space=tune.loguniform(1e-4, 1e2),
                 mu_search_space=tune.uniform(-1, 1),
                 lambda_search_space=tune.loguniform(1e-5, 1e3),
                 horizon_search_space=tune.randint(1, 50),

The archive can be stored as-is in a regular grid, mapping to the actual measure values with (for example) scikit-learn transformers. One thing you can do already is leave the mapping to the measuring function, so instead of for example storing sigma in the archive, you store the log of sigma. It would be understandable if you want to leave this to the measuring function as that is probably conceptually cleaner while keeping a linearly scaling archive.

btjanaka commented 1 year ago

Hi @LemonPi, thanks for reaching out!

To make sure I understand your question, when you say that the measures scale linearly along each dimension, are you referring to how an archive like GridArchive divides each dimension into equally-spaced cells? For example, given the range [0, 10], we could have cells at [0, 1, 2, 3, ..., 10]. Then, when you discuss logscale measures, would one case be, given the range [0, 10], to instead have cells at [0, 1, 2, 4, 8]?

If my understanding is correct, I think your idea of taking the log of the measures on your own is the easiest and best solution. While it is possible to modify GridArchive to include transforms as a parameter, the user would still have to create the transformation function and pass it in. In other words, I think it would take the same amount of effort to have the user call the transformation function on their own. Furthermore, our definition of transformation function may ultimately be too restrictive for the user; for instance, some functions could depend on stateful variables.

One final possibility that may be relevant to you is adding the ability for GridArchive to handle arbitrary cell boundaries, e.g., for a range [0, 10], the user could pass in [0, 3, 4, 7, 9, 10]. This would be accomplished by performing a binary search to find the correct bin, similar to what is done in SlidingBoundariesArchive. This is something that I think is reasonably easy to implement and has many potential use cases.

LemonPi commented 1 year ago

Thanks for the consideration. I ended up with handling the transforms in the measure function. I don't think I'll be needing SlidingBoundariesArchive.

btjanaka commented 1 year ago

Sounds good! I wouldn't recommend SlidingBoundariesArchive in your case either; I think it's too complicated for what you are trying to do.