Is there any documentation of how many training steps were used to obtain the pre-trained models? Some of the pretrained models score far below the state of the art. For instance, the DQN models on BeamRider and Qbert only achieve 948.0 and 550.0, respectively, whereas other algorithms (e.g., PPO2 and ACKTR) reach reward values of 10,000+.
It would be better if you could provide these pre-trained models as trustworthy baselines for benchmarking.
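
For context, here is roughly how I measured those numbers — a minimal evaluation sketch, assuming the models follow the stable-baselines API and the standard Atari wrappers. The file name `dqn_beamrider.zip` and the exact wrapper settings are assumptions on my part:

```python
# Minimal evaluation sketch (hypothetical model path "dqn_beamrider.zip";
# assumes the published models follow the stable-baselines save format).
from stable_baselines import DQN
from stable_baselines.common.cmd_util import make_atari_env
from stable_baselines.common.vec_env import VecFrameStack

# Standard Atari preprocessing, but with reward clipping and episodic-life
# termination disabled so the printed score is the raw game score.
env = VecFrameStack(
    make_atari_env("BeamRiderNoFrameskip-v4", num_env=1, seed=0,
                   wrapper_kwargs=dict(clip_rewards=False, episode_life=False)),
    n_stack=4,
)

model = DQN.load("dqn_beamrider.zip")  # placeholder path to the pretrained model

episode_rewards = []
for _ in range(10):
    obs = env.reset()
    done = [False]
    total = 0.0
    while not done[0]:
        action, _ = model.predict(obs, deterministic=True)
        obs, rewards, done, _ = env.step(action)
        total += rewards[0]
    episode_rewards.append(total)

print("mean reward over 10 episodes:", sum(episode_rewards) / len(episode_rewards))
```

If the published models were evaluated differently (e.g., with stochastic actions or a different number of frames stacked), it would help to document that alongside the training-step counts.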