DLR-RM / rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
https://rl-baselines3-zoo.readthedocs.io
MIT License
2.08k stars, 516 forks

Use discrete uniform distributions instead of categorical distributions for defining certain hyperparameter spaces #211

Open jkterry1 opened 2 years ago

jkterry1 commented 2 years ago

tl;dr: should we do the hack I propose in this issue https://github.com/optuna/optuna/issues/3335 if the Optuna people don't want to add this, or if it doesn't happen in a reasonable length of time? To my knowledge, this should meaningfully improve the sample efficiency of the hyperparameter search, especially in environments that are highly sensitive to the discount factor (which is how I stumbled upon this in the first place).

araffin commented 2 years ago

Hello, thanks for suggesting the feature to Optuna. If I understand, you want to automate something that is similar to https://github.com/optuna/optuna-examples/blob/main/rl/sb3_simple.py#L44 ?

My main concern with the hack is the need to add user attributes to know which value was actually used. Let's see what the Optuna devs say.
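
For readers following along, here is a rough sketch of what such a hack might look like. It is one possible reading, not the code from the linked Optuna issue. It assumes the idea is to sample an integer index into an ordered list of candidate values with `trial.suggest_int`, rather than passing the list to `trial.suggest_categorical`, and then to store the real value as a user attribute. The names `GAMMA_CHOICES`, `sample_gamma`, and `run_study` are hypothetical, as are the candidate values and the toy objective.

```python
# Hypothetical candidate list; the zoo's real search spaces may differ.
GAMMA_CHOICES = [0.9, 0.95, 0.98, 0.99, 0.995, 0.999, 0.9999]


def sample_gamma(trial):
    """Sample gamma via an ordered integer index instead of a categorical."""
    # suggest_int exposes an ordered integer parameter to the sampler,
    # whereas suggest_categorical treats the choices as unordered.
    idx = trial.suggest_int("gamma_idx", 0, len(GAMMA_CHOICES) - 1)
    gamma = GAMMA_CHOICES[idx]
    # The stored parameter is only the index, so record the real value
    # as a user attribute to make it visible in the study results.
    trial.set_user_attr("gamma", gamma)
    return gamma


def run_study(n_trials=20):
    """Toy study; the objective is a placeholder, not an RL training run."""
    import optuna  # imported lazily so sample_gamma has no hard dependency

    def objective(trial):
        gamma = sample_gamma(trial)
        # Placeholder score that peaks at gamma = 0.99.
        return -abs(gamma - 0.99)

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    # The real value has to be read back from the user attributes.
    return study.best_trial.user_attrs["gamma"]
```

The extra `set_user_attr` call is the bookkeeping cost raised above: without it, the study only records `gamma_idx`, and the actual discount factor has to be looked up by hand.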

jkterry1 commented 2 years ago

So it looks like we'd need to add this here, per the newest comment in https://github.com/optuna/optuna/issues/3335. I'll ask someone to work on it.