araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
https://stable-baselines.readthedocs.io/
MIT License
1.12k stars, 206 forks

Missing (or can't find) Hyperparameters #119

Closed by jeff-hykin 2 years ago

jeff-hykin commented 2 years ago

Describe the bug

Hyperparameters seem to be missing from a2c.yaml.

I've searched the repo, but it's possible I'm still looking in the wrong place.

Code example

atari:
  policy: 'CnnPolicy'
  n_envs: 16
  n_timesteps: !!float 1e7
  lr_schedule: 'constant'

I wasn't able to find a definition for the CnnPolicy

I'm looking for the values of all of these for A2C on breakout v4 without frameskip

    gamma = trial.suggest_categorical('gamma', [0.9, 0.95, 0.98, 0.99, 0.995, 0.999, 0.9999])
    n_steps = trial.suggest_categorical('n_steps', [8, 16, 32, 64, 128, 256, 512, 1024, 2048])
    lr_schedule = trial.suggest_categorical('lr_schedule', ['linear', 'constant'])
    learning_rate = trial.suggest_loguniform('lr', 1e-5, 1)
    ent_coef = trial.suggest_loguniform('ent_coef', 0.00000001, 0.1)
    vf_coef = trial.suggest_uniform('vf_coef', 0, 1)

araffin commented 2 years ago

Hello,

First, please use the Stable-Baselines3 version: https://github.com/DLR-RM/rl-baselines3-zoo

> I see some values for atari, but not the hyperparameters for breakout

https://github.com/DLR-RM/rl-baselines3-zoo/blob/e62769b8a04157848af57cb169a6715cf50fe5d4/utils/exp_manager.py#L255-L256

They are the same.
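
A rough sketch of the kind of fallback the linked lines implement: if an environment has no entry of its own, Atari games share the generic `atari` entry. This is an illustration only, not the zoo's actual code; `select_hyperparams`, its arguments, and the `is_atari` flag are hypothetical names, and the values are the ones from the a2c.yaml snippet in the question.

```python
def select_hyperparams(hyperparams_dict, env_id, is_atari):
    """Return the entry for env_id, falling back to the shared 'atari' entry."""
    if env_id in hyperparams_dict:
        # An environment-specific entry always wins.
        return hyperparams_dict[env_id]
    if is_atari:
        # All Atari games without their own entry share one set of values.
        return hyperparams_dict["atari"]
    raise ValueError(f"Hyperparameters not found for {env_id}")


# Values copied from the a2c.yaml snippet in the question.
a2c_hyperparams = {
    "atari": {
        "policy": "CnnPolicy",
        "n_envs": 16,
        "n_timesteps": 1e7,
        "lr_schedule": "constant",
    },
}

# Breakout has no dedicated entry, so it resolves to the 'atari' values.
breakout_params = select_hyperparams(
    a2c_hyperparams, "BreakoutNoFrameskip-v4", is_atari=True
)
```

Any hyperparameter that does not appear in the resolved entry would then be left to whatever the algorithm itself uses when the argument is not passed.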

> I wasn't able to find a definition for the CnnPolicy

Please take a look at the SB2/SB3 documentation for that.

> I'm looking for the values of all of these for A2C on breakout v4 without frameskip

Current hyperparameters are for A2C with default Atari pre-processing, which includes frameskip.
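
For readers unfamiliar with the term, frame skipping means each action chosen by the agent is repeated for several emulator frames, with the rewards summed. The sketch below shows the idea on a toy environment. `CountingEnv`, `FrameSkip`, and the skip value of 4 are assumptions made for illustration; real Atari pre-processing wrappers do more than this (for example, max-pooling over consecutive frames and resizing observations).

```python
class CountingEnv:
    """Toy environment: each frame gives reward 1.0; ends after max_frames frames."""

    def __init__(self, max_frames=10):
        self.max_frames = max_frames
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        self.frame += 1
        done = self.frame >= self.max_frames
        # Observation is simply the current frame index.
        return self.frame, 1.0, done, {}


class FrameSkip:
    """Repeat each action for `skip` frames and sum the rewards."""

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                # Stop repeating the action once the episode is over.
                break
        return obs, total_reward, done, info


env = FrameSkip(CountingEnv(max_frames=10), skip=4)
env.reset()
# One agent step advances the underlying environment by four frames.
obs, reward, done, info = env.step(action=0)
```

Because one agent step covers several frames, settings such as `n_steps` or `n_timesteps` that were tuned with frameskip do not necessarily carry over unchanged to a setup without it.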

jeff-hykin commented 2 years ago

Thank you @araffin!