Closed: pierrekhouryy closed this issue 2 years ago
If I understand this correctly, the error is happening on a line of code you've introduced, namely:
rl_algo = PPO("MultiInputPolicy", venv, verbose=1)
Do you have an example of PPO training working outside of imitation? Right now I do not see how this issue is related to our library.
My guess from the error message is a Gym version incompatibility, I think Gym changed the spaces
attribute a few versions back. It may also be that parking-v0
is not implementing the standard Gym API fully.
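The "Dict vs. flat space" guess can be illustrated without Gym at all. The classes below are hypothetical stand-ins (not the real gym classes): a Dict space carries a .spaces mapping of named sub-spaces, while a Box space does not, so code written against the Dict interface raises AttributeError when handed a Box.

```python
class Box:
    """Stand-in for gym.spaces.Box: a flat array space with only a shape."""
    def __init__(self, shape):
        self.shape = shape

class Dict:
    """Stand-in for gym.spaces.Dict: a mapping of named sub-spaces."""
    def __init__(self, spaces):
        self.spaces = spaces

def subspace_names(observation_space):
    # Code like this assumes a Dict observation space and fails on a Box.
    return list(observation_space.spaces)

print(subspace_names(Dict({"observation": Box((6,)), "desired_goal": Box((3,))})))
try:
    subspace_names(Box((4,)))
except AttributeError as err:
    print(err)  # 'Box' object has no attribute 'spaces'
```

This mirrors the AttributeError reported in the issue: somewhere a Box observation space reaches code that expects a Dict.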
Yes, I have two working examples with SB3. The first uses the "parking-v0" env:
import gym
import highway_env
from stable_baselines3 import PPO
env = gym.make("parking-v0")
model = PPO("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=20000)
model.save("ppo_parking")
del model
model = PPO.load("ppo_parking")
and another one with the env "FetchReach-v1":
import gym
from stable_baselines3 import PPO
env = gym.make("FetchReach-v1")
model = PPO("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=20000)
model.save("ppo_fetch")
del model
model = PPO.load("ppo_fetch")
Both use dict observations, and both seem to train, save, and load fine with SB3.
However, running them with imitation causes some kind of error regarding the observations.
In the FetchReach-v1 example, we are able to load the model, but it later crashes when the method generate_trajectories is called:
File "imitation/src/imitation/data/rollout.py", line 390, in generate_trajectories
exp_obs = (n_steps + 1,) + venv.observation_space.shape
TypeError: can only concatenate tuple (not "NoneType") to tuple
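The mechanism behind this TypeError can be reproduced with plain Python stand-ins (BoxSpace and DictSpace below are hypothetical mocks, not gym classes): in Gym, a Dict observation space has shape None, so concatenating it onto a tuple, as the rollout code does, raises exactly this error.

```python
class BoxSpace:
    """Stand-in for gym.spaces.Box: has a concrete shape tuple."""
    def __init__(self, shape):
        self.shape = shape

class DictSpace:
    """Stand-in for gym.spaces.Dict: its .shape is None."""
    shape = None

n_steps = 10
box = BoxSpace((4,))
# Works for array-valued spaces: prepend the time dimension to the shape.
print((n_steps + 1,) + box.shape)  # (11, 4)

try:
    # Fails for dict-valued spaces, whose shape is None.
    (n_steps + 1,) + DictSpace().shape
except TypeError as err:
    print(err)  # can only concatenate tuple (not "NoneType") to tuple
```

So the rollout code assumed an array-shaped observation space, and the dict observation spaces of these envs break that assumption.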
Your examples are passing a single environment to PPO, not a vectorized environment as imitation expects. Your imitation example is incomplete (I do not see the code creating venv), so it's hard to know exactly what's going on, but I expect a vectorized vs. non-vectorized mixup.
Closing due to inactivity.
Describe the bug
Cannot load pre-trained PPO model in script "train_rl.py"
System Specifications
Expected Behavior
The model was supposed to load and later generate an expert dataset based on this model.
Actual Behavior
The code is crashing and I'm getting the following error:
Relevant Screenshots / Outputs
On the following line:
rl_algo = PPO("MultiInputPolicy", venv, verbose=1)
train_rl.py is crashing and giving this error: AttributeError: 'Box' object has no attribute 'spaces'
Steps to Reproduce the problem
imitation/src/imitation/scripts/config/train_rl.py:
Then, in setup.py, change the version of stable_baselines from "stable-baselines3>=1.1.0" to "stable-baselines3==1.2.0", and add the highway_env library. Finally, add the following code to the file
imitation/src/imitation/scripts/train_rl.py: