edbeeching / godot_rl_agents

An open-source package that allows video game creators, AI researchers, and hobbyists to learn complex behaviors for their Non-Player Characters or agents
MIT License

Feature/hp tuning #130

Closed: visuallization closed this 1 year ago

Ivan-267 commented 1 year ago

This is a great idea, the optuna branch looks interesting.

I can offer some suggestions to consider: we can get the number of agents from the environment, wrap both environments (training and eval) with a VecMonitor to resolve a warning related to it, and potentially set different ports for the two environments (not sure if this is necessary).
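For illustration, a minimal sketch of the first two suggestions (assuming the usual godot_rl import paths and that the SB3 wrapper exposes the standard VecEnv `num_envs` attribute):

    from stable_baselines3.common.vec_env import VecMonitor
    from godot_rl.wrappers.stable_baselines_wrapper import StableBaselinesGodotEnv

    # Wrap the env in a VecMonitor so SB3 records proper episode statistics.
    env = VecMonitor(StableBaselinesGodotEnv(env_path=args.env_path, speedup=args.speedup))

    # The wrapper behaves like a VecEnv, so the number of agents can be read
    # from it instead of being hard-coded.
    n_agents = env.num_envs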

However, I would like to check something first, even though the main repository for optuna is MIT licensed and e.g. the Dashboard repo also has MIT license stated, I couldn't find any license specified at the examples repo where the original script is from: https://github.com/optuna/optuna-examples/tree/main. Perhaps it's specified somewhere else, but I couldn't find the specific license for that repo during a quick search. I would like your opinion on this.

visuallization commented 1 year ago

@Ivan-267 I just asked: https://github.com/optuna/optuna-examples/issues/199

visuallization commented 1 year ago

> This is great, it is a shame we add another dependency (optuna), but I guess that is ok.
>
> LGTM :)

@edbeeching I could also remove it and just let people know they should install it when running the script. :)

visuallization commented 1 year ago

@Ivan-267 Thanks for your suggestions, they make perfect sense. I now get the number of agents dynamically and also wrap the training env with a VecMonitor. The reason I didn't do that initially was an error, but it seems to work fine now.

Ivan-267 commented 1 year ago

Great additions, and thanks for checking regarding the license.

For potentially removing the dependency, in case you decide to go that route: as you already mentioned, we could note somewhere in the docs for the script (either in the adv_stable_baselines docs or a file for that example specifically) that `pip install optuna` is needed before using this example. That seems like an OK option if the script is not always used with sb3; if it turns out to be a very frequently used option, we can always (re-)add it as a dependency.

I'm not sure that this is necessary, but we could set a unique port for the eval environment just in case (the n_parallel environments use unique ports for each instance already):

One option is something like this:

    from godot_rl.core.godot_env import GodotEnv

    # Create the RL model; the training envs start from a dedicated port.
    training_port = GodotEnv.DEFAULT_PORT + 1
    model = PPO("MultiInputPolicy", VecMonitor(StableBaselinesGodotEnv(env_path=args.env_path, speedup=args.speedup, n_parallel=args.n_parallel, port=training_port)), tensorboard_log="logs/optuna", **kwargs)

    # Create the env used for evaluation (it keeps the default port).
    eval_env = VecMonitor(StableBaselinesGodotEnv(env_path=args.env_path, speedup=args.speedup))

By using `GodotEnv.DEFAULT_PORT + 1`, the training envs will start one port higher than the eval env.
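To make the layout concrete (an assumption based on the behavior described above, with 11008 as the usual `GodotEnv.DEFAULT_PORT`):

    # Illustrative port layout, assuming each parallel instance takes the next port:
    # eval env:      GodotEnv.DEFAULT_PORT                      -> e.g. 11008
    # training envs: GodotEnv.DEFAULT_PORT + 1 ... + n_parallel -> 11009, 11010, ...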

edbeeching commented 1 year ago

Hey @visuallization, this LGTM. If you want to change the port for the eval / training env, then we can merge?

visuallization commented 1 year ago

@Ivan-267 Thanks for your port suggestions, I've added them.

@edbeeching I also now print a custom error message when users run the script without optuna installed. Will merge this now. Cheers
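A minimal sketch of such a guard (hypothetical wording; the actual message in the script may differ):

    # Hypothetical import guard with a friendlier error message.
    try:
        import optuna
    except ImportError as e:
        raise ImportError(
            "This example requires optuna. Install it with: pip install optuna"
        ) from e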

visuallization commented 1 year ago

@Ivan-267 This has been resolved now: https://github.com/optuna/optuna-examples/issues/199