tambetm / simple_dqn

Simple deep Q-learning agent.
MIT License

AssertionError: Cannot call env.step() before calling reset(). How to remove the error? #45

Closed ayushya2531 closed 7 years ago

ayushya2531 commented 7 years ago

./train.sh MsPacman-v0 --environment gym --backend cpu
[2017-04-20 02:37:58,842] Making new env: MsPacman-v0
[2017-04-20 02:37:58,917] Using Gym Environment
[2017-04-20 02:37:58,923] Replay memory size: 1000000
[2017-04-20 02:37:58,923] Initialized NervanaCPU
[2017-04-20 02:37:58,924] Backend: cpu, RNG seed: None
[2017-04-20 02:37:59,082] Results are written to results/MsPacman-v0.csv
[2017-04-20 02:37:59,086] Populating replay memory with 50000 random moves
Traceback (most recent call last):
  File "src/main.py", line 136, in <module>
    agent.play_random(args.random_steps)
  File "/home/ayushya/neon/simple_dqn/src/agent.py", line 91, in play_random
    action, reward, screen, terminal = self.step(1)
  File "/home/ayushya/neon/simple_dqn/src/agent.py", line 65, in step
    reward = self.env.act(action)
  File "/home/ayushya/neon/simple_dqn/src/environment.py", line 135, in act
    self.obs, reward, self.terminal, _ = self.gym.step(action)
  File "/home/ayushya/neon/.venv2/local/lib/python2.7/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/ayushya/neon/.venv2/local/lib/python2.7/site-packages/gym/wrappers/time_limit.py", line 35, in _step
    assert self._episode_started_at is not None, "Cannot call env.step() before calling reset()"
AssertionError: Cannot call env.step() before calling reset()

The model is not being trained because of this error. I followed the steps in the README file. Please help me out. I am new to this, so let me know if you need any other information. Thank you.

tambetm commented 7 years ago

I see your problem. The actual issue is that play_random() does not call self._restartRandom(). We don't actually care about random restarts in this case, but calling self.env.restart() directly wouldn't populate self.buf, and that might cause trouble. As we are doing random actions anyway, self._restartRandom() is fine. Would be happy to accept a tested PR for that!
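For anyone reading later, here is a rough, self-contained sketch of the idea described above. It is not the repository's actual code: StubEnvironment, SketchAgent, numActions(), isTerminal(), history_length and random_starts are hypothetical stand-ins, and only the names play_random(), _restartRandom(), step(), env.act(), env.restart() and self.buf come from this thread. The stub environment refuses to act before restart(), the way the Gym wrapper in the traceback does.

```python
import random


class StubEnvironment:
    """Hypothetical stand-in for the Gym-backed environment wrapper."""

    def __init__(self, episode_length=20):
        self.episode_length = episode_length
        self.steps_left = None  # None means restart() has not been called yet

    def numActions(self):
        return 4  # arbitrary number of actions for the sketch

    def restart(self):
        self.steps_left = self.episode_length

    def act(self, action):
        # Same failure mode as in the traceback: acting before a reset is an error.
        assert self.steps_left is not None, \
            "Cannot call env.step() before calling reset()"
        self.steps_left -= 1
        return 1  # dummy reward

    def isTerminal(self):
        return self.steps_left is not None and self.steps_left <= 0


class SketchAgent:
    """Hypothetical, heavily simplified agent; only the call order matters here."""

    def __init__(self, env, history_length=4, random_starts=3, seed=0):
        self.env = env
        self.buf = []  # stand-in for the state buffer mentioned above
        self.history_length = history_length
        self.random_starts = random_starts
        self.rng = random.Random(seed)

    def _restartRandom(self):
        self.env.restart()
        # A few random actions after the restart, so that self.buf is populated.
        warmup = self.rng.randint(self.history_length,
                                  self.history_length + self.random_starts)
        for _ in range(warmup):
            self.step()

    def step(self):
        action = self.rng.randrange(self.env.numActions())
        reward = self.env.act(action)
        # Keep only the last history_length entries in the buffer.
        self.buf = (self.buf + [action])[-self.history_length:]
        return action, reward, self.env.isTerminal()

    def play_random(self, random_steps):
        self._restartRandom()  # the call that was missing
        for _ in range(random_steps):
            _, _, terminal = self.step()
            if terminal:
                self._restartRandom()


agent = SketchAgent(StubEnvironment())
agent.play_random(50)  # completes without the AssertionError
```

Without the first line of play_random(), the very first step() would hit the assertion in the stub, which is the same failure as in the report.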

tambetm commented 7 years ago

Merged #47.