araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
https://stable-baselines.readthedocs.io/
MIT License

enjoy.py throws error about observation space with HalfCheetahBulletEnv-v0 and own saved model #64

Closed kncrane closed 4 years ago

kncrane commented 4 years ago

Hi,

Playing around with stable baselines 2.9.0 (installed with pip) on Ubuntu 18.04.4 LTS with Python 3.6.9, gym 0.16.0, tensorflow 1.14.0 and pybullet 2.6.5.

When I run

python enjoy.py --algo ppo2 --env HalfCheetahBulletEnv-v0 --folder trained_agents/ -n 150000

all is well.

When I run

python enjoy.py --algo ppo2 --env HalfCheetahBulletEnv-v0 --folder logs/ -n 150000

so that enjoy.py loads the model I have trained and saved with train.py, I get the following error:

"Error: the environment passed must have at least the same observation space as the model was trained on."

While trying to track down the problem today, I noticed that, because of the stored hyperparameters for ppo2 on HalfCheetahBulletEnv-v0, train.py wraps the training environment in the TimeFeatureWrapper from utils/wrappers.py. enjoy.py does not, because it ends up in the elif "Bullet" in env_id: branch of the create_test_env() function in utils/utils.py.

I've looked, and the wrapper changes the observation space shape from (26,) to (27,), so that may be what the error message is complaining about.
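To illustrate the shape change described above, here is a minimal, dependency-free sketch of what a time-feature wrapper does. It is not the code from utils/wrappers.py: DummyEnv and TimeFeatureSketch are hypothetical stand-ins, the max_steps default of 1000 is an assumption, and the idea that the extra value is the remaining fraction of the episode is my reading of the wrapper, so check the real file before relying on it.

```python
class DummyEnv:
    """Hypothetical stand-in for an environment with a 26-dimensional observation."""

    obs_dim = 26

    def reset(self):
        return [0.0] * self.obs_dim

    def step(self, action):
        # Returns (observation, reward, done, info), like the classic gym API.
        return [0.0] * self.obs_dim, 0.0, False, {}


class TimeFeatureSketch:
    """Appends one extra value (remaining fraction of the episode) to each observation."""

    def __init__(self, env, max_steps=1000):  # max_steps is an assumed default
        self.env = env
        self.max_steps = max_steps
        self.obs_dim = env.obs_dim + 1  # 26 -> 27
        self.current_step = 0

    def _augment(self, obs):
        time_feature = 1.0 - self.current_step / self.max_steps
        return list(obs) + [time_feature]

    def reset(self):
        self.current_step = 0
        return self._augment(self.env.reset())

    def step(self, action):
        self.current_step += 1
        obs, reward, done, info = self.env.step(action)
        return self._augment(obs), reward, done, info


if __name__ == "__main__":
    env = DummyEnv()
    wrapped = TimeFeatureSketch(env)
    print(len(env.reset()), len(wrapped.reset()))  # prints "26 27"
```

A model trained on the wrapped environment expects 27 inputs, so loading it against the unwrapped 26-dimensional environment would plausibly trigger an observation space mismatch like the one quoted above.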

Am I barking up the right tree? And how come the error doesn't occur with the zoo's trained_agents saved models?

Thank you!

araffin commented 4 years ago

Hello,

I have tried to reproduce the error but did not manage to.

What I did:

Train a new model (only 100 steps to test):

python train.py --algo ppo2 --env HalfCheetahBulletEnv-v0 -n 100

Enjoy:

python enjoy.py --algo ppo2 --env HalfCheetahBulletEnv-v0 -f logs/ --exp-id 0

exp-id 0 is for loading the latest experiment.
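As a rough sketch of what "loading the latest experiment" could mean, the snippet below picks the highest numeric suffix among run folders named like HalfCheetahBulletEnv-v0_1. The function name latest_run_id and the exact folder-naming rule are assumptions for illustration, not the zoo's actual implementation.

```python
def latest_run_id(dir_names, env_id):
    """Return the largest N among names of the form '<env_id>_<N>', or 0 if none match."""
    prefix = env_id + "_"
    run_ids = [
        int(name[len(prefix):])
        for name in dir_names
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    ]
    return max(run_ids, default=0)


if __name__ == "__main__":
    # Hypothetical folder listing under logs/ppo2/
    names = ["HalfCheetahBulletEnv-v0_1", "HalfCheetahBulletEnv-v0_3", "AntBulletEnv-v0_7"]
    print(latest_run_id(names, "HalfCheetahBulletEnv-v0"))  # prints 3
```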

What is your pyyaml version?

PyYAML==5.1.2

kncrane commented 4 years ago

@araffin my PyYAML version is 5.3 but I see what I've done, daft mistake! When I initially ran

python enjoy.py --algo ppo2 --env HalfCheetahBulletEnv-v0 --folder logs/

without the --exp-id param, I got the error message 'ValueError: No model found for ppo2 on HalfCheetahBulletEnv-v0, path: logs/ppo2/HalfCheetahBulletEnv-v0.zip'. I ended up pulling the zip file out of the HalfCheetahBulletEnv-v0_1 subdirectory and into logs/ppo2/, so that it was in the place I thought the load() function was looking for it, but I didn't move the associated directory containing the config.yaml, obs_rms.pkl and ret_rms.pkl files. That appears to be why the model was found but the observation space error was thrown. Putting the zip file back where it ought to be and using --exp-id works fine.
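For anyone who hits the same thing, a quick sanity check along these lines would have caught my mistake. This is only a sketch: missing_companion_files is a hypothetical helper, and it assumes the companion directory sits next to the zip and has the same name minus the .zip extension, which you should verify against your own logs/ folder.

```python
from pathlib import Path

# Files mentioned above that need to travel with the saved model.
EXPECTED_FILES = ("config.yaml", "obs_rms.pkl", "ret_rms.pkl")


def missing_companion_files(model_zip):
    """List the expected files that are absent from the directory next to the model zip."""
    model_zip = Path(model_zip)
    # Assumption: the companion directory is named like the zip, without ".zip".
    companion_dir = model_zip.with_suffix("")
    return [name for name in EXPECTED_FILES if not (companion_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_companion_files("logs/ppo2/HalfCheetahBulletEnv-v0.zip")
    if missing:
        print("Missing next to the model:", ", ".join(missing))
    else:
        print("All companion files found.")
```

An empty list means the zip and its config and normalization files are still together; a non-empty list suggests the zip was moved on its own.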

Whilst you're reading this: in the benchmark.md file, I'm assuming the n_timesteps column is the number of steps used when evaluating, not the number of training timesteps. Is there anywhere that gives the number of training timesteps for the trained_agents? In baselines it's 1 million, so I'm just wondering if it's the same.

Thanks for responding so quickly, appreciate it.

araffin commented 4 years ago

Is there anywhere that gives the number of training timesteps for the trained_agents

I already answered that question in #38 ;)