araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
https://stable-baselines.readthedocs.io/
MIT License

Inclusion of baseline results #48

Closed. sytelus closed this issue 4 years ago.

sytelus commented 4 years ago

There should be a way to see your results that tells you what one should expect when running the training from scratch. At a minimum, there should be information on the number of training steps and the eventual 100-episode average reward one might expect from the baseline, but much better would be to show the entire training curve. Without this, a baseline is not very meaningful, as one may never know whether they actually replicated the expected result.

A few good RL baseline frameworks do this; for example, here is how other frameworks display their results: Garage, RLlib, Coach. I love the UX that Garage provides, as well as Coach's approach of making the results part of the repo itself.

Currently, there is a benchmark.zip file in the repo, but it seems monitor.csv and progress.csv are not helpful (for example, for DQN, progress.csv is empty and monitor.csv only has the last few rows). Furthermore, these files are not produced at all if you run the experiment yourself.
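For instance, here is roughly what I would expect to be able to do against a published monitor file (the path is just an example, not an actual file in the repo):

```python
import pandas as pd

# Monitor files start with one '#{...}' metadata line, then columns r, l, t
df = pd.read_csv("benchmark/dqn/BreakoutNoFrameskip-v4/0.monitor.csv", skiprows=1)

# 100-episode moving average of the episode reward column
print(df["r"].rolling(100).mean().iloc[-1])
```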

araffin commented 4 years ago

Hello,

At a minimum, there should be information on a number of training steps

You have that in the hyperparameters file and in the config file associated with each trained agent (at least starting with release 1.0 of the zoo). The final performance can be found in benchmark.md; note that the results correspond to only one seed (it is not meant for quantitative comparison).
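For example, roughly (the file layout follows the repo; exact entries vary by release, and Atari envs may share a single `atari` entry rather than per-game keys):

```python
import yaml

# Tuned hyperparameters are stored per algorithm, keyed by env id
with open("hyperparams/dqn.yml") as f:
    hyperparams = yaml.safe_load(f)

print(hyperparams["CartPole-v1"]["n_timesteps"])
```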

Yes, a training curve would be a good addition; even better would be a learning curve obtained by periodically evaluating the agent on a test env (this is planned to be supported with the callback collection), but you would need at least 10 runs per algorithm per environment.
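For illustration only (this is just a sketch of the idea, not the callback collection API):

```python
import numpy as np

def evaluate_sketch(model, eval_env, n_eval_episodes=10):
    """Run the current policy on a held-out test env and
    return the mean/std episode reward (illustrative helper)."""
    returns = []
    for _ in range(n_eval_episodes):
        obs, done, ep_return = eval_env.reset(), False, 0.0
        while not done:
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, done, _ = eval_env.step(action)
            ep_return += reward
        returns.append(ep_return)
    return np.mean(returns), np.std(returns)
```

Calling something like that every N training steps, for each of the 10 runs, is what would give a proper learning curve.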

it seems monitor.csv

monitor.csv can give you the training learning curve, which is only a proxy for the real performance.

Furthermore, these files are not produced at all currently if you run the experiment.

If you don't specify a log folder, nothing is produced, yes.
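(In the zoo, that means passing something like `python train.py --algo dqn --env BreakoutNoFrameskip-v4 -f logs`; the relevant option is the `-f`/`--log-folder` argument, see `python train.py --help` for your version.)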

sytelus commented 4 years ago

I think my comment was probably misunderstood. I'm currently trying to train a model for Breakout and reproduce the results. There is nothing in this repo that tells me what I should expect or how to know whether the training was successful. As it happens, something is possibly broken in OpenAI Baselines as well as Stable Baselines, so the training for Breakout isn't generating graphs that convincingly converge.

sytelus commented 4 years ago

Also, it looks like in the current codebase there is no call to logger.configure() at all when running train.py. This possibly explains why no monitor.csv and progress.csv are generated even when a log directory is specified.
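For what it's worth, here is a minimal sketch of what I would expect to happen, using stable-baselines' ported OpenAI logger and the Monitor wrapper (argument names follow the version I have installed, so double-check against yours):

```python
import gym
from stable_baselines import DQN, logger
from stable_baselines.bench import Monitor

# progress.csv comes from the tabular logger; it only appears if
# logger.configure() is called with a csv output format
logger.configure(folder="logs/dqn", format_strs=["stdout", "csv"])

# monitor.csv, on the other hand, is written by the Monitor env wrapper
env = Monitor(gym.make("CartPole-v1"), filename="logs/dqn/monitor.csv")

model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
```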