ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.56k stars, 831 forks

CartPole-v0 model can't be loaded by enjoy.py #148

Closed: swenner closed this issue 5 years ago

swenner commented 5 years ago

Baselines version: 0.1.5

python3 main.py --env-name "CartPole-v0" --num-frames 100000

python3 enjoy.py --load-dir trained_models/a2c --env-name "CartPole-v0"

Traceback (most recent call last):
  File "enjoy.py", line 66, in <module>
    obs, reward, done, _ = env.step(action)
  File "/home/simon/.local/lib/python3.5/site-packages/baselines/common/vec_env/__init__.py", line 78, in step
    return self.step_wait()
  File "/home/simon/pytorch-a2c-ppo-acktr-git/envs.py", line 145, in step_wait
    obs, reward, done, info = self.venv.step_wait()
  File "/home/simon/.local/lib/python3.5/site-packages/baselines/common/vec_env/vec_normalize.py", line 26, in step_wait
    obs, rews, news, infos = self.venv.step_wait()
  File "/home/simon/.local/lib/python3.5/site-packages/baselines/common/vec_env/dummy_vec_env.py", line 23, in step_wait
    obs_tuple, self.buf_rews[i], self.buf_dones[i], self.buf_infos[i] = self.envs[i].step(self.actions[i])
  File "/home/simon/.local/lib/python3.5/site-packages/gym/wrappers/time_limit.py", line 31, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/simon/.local/lib/python3.5/site-packages/gym/envs/classic_control/cartpole.py", line 92, in step
    assert self.action_space.contains(action), "%r (%s) invalid"%(action, type(action))
AssertionError: array([0, 0]) (<class 'numpy.ndarray'>) invalid
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0x7f2a294b8d08>
Traceback (most recent call last):
  File "/usr/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable
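
The assertion at the bottom of the first traceback fires because CartPole's action space holds a single discrete action (0 or 1), while the value that reached the environment's step call was a two-element array. The following is a minimal sketch of that kind of check, not the actual gym code: `DiscreteSpace` and `unwrap_single_action` are hypothetical names introduced here for illustration, and plain Python lists stand in for NumPy arrays.

```python
class DiscreteSpace:
    """Hypothetical stand-in for a discrete action space with actions 0..n-1."""

    def __init__(self, n):
        self.n = n

    def contains(self, action):
        # Only a single integer in [0, n) counts as valid; sequences are rejected.
        if isinstance(action, bool) or not isinstance(action, int):
            return False
        return 0 <= action < self.n


def unwrap_single_action(batched_action):
    """Reduce a nested one-element batch (e.g. [[1]]) to a plain int.

    Hypothetical helper: raises if any level holds more than one entry,
    which is the situation shown in the traceback above.
    """
    action = batched_action
    while isinstance(action, (list, tuple)):
        if len(action) != 1:
            raise ValueError("expected exactly one action, got %r" % (action,))
        action = action[0]
    return int(action)


space = DiscreteSpace(2)  # CartPole has two actions: push left, push right
print(space.contains([0, 0]))                        # two-element action is rejected
print(space.contains(unwrap_single_action([[1]])))   # a single unwrapped action passes
```

In this sketch, the environment only ever accepts one scalar action per step, so any batching added by the policy or by a vectorized wrapper has to be removed before the value reaches the underlying environment.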

ikostrikov commented 5 years ago

I'm using python 3.6 and torch from conda. I tested it with the latest baselines and gym.

It seems to work for me.

ikostrikov commented 5 years ago

Thanks for reporting the problem!

Fixed in https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/commit/88080da828dd4132bec0456b996e516fe356f75f