ugo-nama-kun / DQN-chainer

MIT License

Error after run Freeze learning for Evaluation step #1

Closed deruci closed 9 years ago

deruci commented 9 years ago

Hi.

I'm using

ALE 0.5.0 RL-Glue 3.04 Build 909 and very recent chainer

after this message (in experiment_ale.py)

DQN-ALE Experiment starting up!
RL-Glue Python Experiment Codec Version: 2.02 (Build 738)
Connecting to 127.0.0.1 on port 4096...
Freeze learning for Evaluation
Evaluation :: 945 steps -21.0 total reward
DQN is Learning
Episode 1 825 steps -21.0 total reward
DQN is Learning
Episode 2 1066 steps -21.0 total reward
DQN is Learning
Episode 3 1041 steps -20.0 total reward
DQN is Learning
Episode 4 1318 steps -19.0 total reward
DQN is Learning
Episode 5 885 steps -21.0 total reward
DQN is Learning
Episode 6 1414 steps -18.0 total reward
DQN is Learning
Episode 7 764 steps -21.0 total reward
DQN is Learning
Episode 8 1039 steps -20.0 total reward
DQN is Learning
Episode 9 885 steps -21.0 total reward
Freeze learning for Evaluation
Evaluation :: 884 steps -21.0 total reward

the agent part gets an error like the one below:

Traceback (most recent call last):
  File "dqn_agent_nature.py", line 303, in <module>
    AgentLoader.loadAgent(dqn_agent())
  File "/home/deruci/anaconda/lib/python2.7/site-packages/rlglue/agent/AgentLoader.py", line 58, in loadAgent
    client.runAgentEventLoop()
  File "/home/deruci/anaconda/lib/python2.7/site-packages/rlglue/agent/ClientAgent.py", line 144, in runAgentEventLoop
    switch[agentState](self)
  File "/home/deruci/anaconda/lib/python2.7/site-packages/rlglue/agent/ClientAgent.py", line 139, in <lambda>
    Network.kAgentStep: lambda self: self.onAgentStep(),
  File "/home/deruci/anaconda/lib/python2.7/site-packages/rlglue/agent/ClientAgent.py", line 62, in onAgentStep
    action = self.agent.agent_step(reward, observation)
  File "dqn_agent_nature.py", line 246, in agent_step
    self.DQN.experienceReplay(self.time)
  File "dqn_agent_nature.py", line 133, in experienceReplay
    loss, _ = self.forward(s_replay, a_replay, r_replay, s_dash_replay, episode_end_replay)
  File "dqn_agent_nature.py", line 89, in forward
    loss = F.mean_squared_error(td_clip, zero_val)
  File "/home/deruci/anaconda/lib/python2.7/site-packages/chainer/functions/mean_squared_error.py", line 61, in mean_squared_error
    return MeanSquaredError()(x0, x1)
  File "/home/deruci/anaconda/lib/python2.7/site-packages/chainer/function.py", line 164, in __call__
    self._check_data_type_forward(in_data)
  File "/home/deruci/anaconda/lib/python2.7/site-packages/chainer/function.py", line 191, in _check_data_type_forward
    self.check_type_forward(in_type)
  File "/home/deruci/anaconda/lib/python2.7/site-packages/chainer/functions/mean_squared_error.py", line 17, in check_type_forward
    in_types[0].shape == in_types[1].shape
  File "/home/deruci/anaconda/lib/python2.7/site-packages/chainer/utils/type_check.py", line 457, in expect
    expr.expect()
  File "/home/deruci/anaconda/lib/python2.7/site-packages/chainer/utils/type_check.py", line 428, in expect
    '{0} {1} {2}'.format(left, self.inv, right))
chainer.utils.type_check.InvalidType:
Expect: in_types[1].dtype == <type 'numpy.float32'>
Actual: float64 != <type 'numpy.float32'>

and in ALE, I can see

Segmentation fault (core dumped)


I used the same command as you suggested. Could this problem be caused by differences in the ALE/RL-Glue/Chainer versions?

Or

Can you tell me how I can fix this?

Thank you.

ugo-nama-kun commented 9 years ago

Hi deruci.

This is a type error raised by the latest Chainer. You can fix it by editing line 88 of the nature-version code as follows:

BEFORE: zero_val = Variable(cuda.to_gpu(np.zeros((self.replay_size, self.num_of_actions))))

AFTER: zero_val = Variable(cuda.to_gpu(np.zeros((self.replay_size, self.num_of_actions), dtype=np.float32)))

or just pull the latest DQN-chainer ;-)
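For context, a minimal NumPy-only sketch of the underlying mismatch (shapes are illustrative, not taken from the repo): np.zeros defaults to float64, while the network outputs (and hence td_clip) are float32, so Chainer's type check rejects the pair unless dtype is set explicitly.

```python
import numpy as np

# np.zeros defaults to float64, which does not match the float32
# arrays a Chainer network produces.
default_zeros = np.zeros((32, 4))
print(default_zeros.dtype)  # float64

# Passing dtype=np.float32 (as in the fix above) makes the types agree.
fixed_zeros = np.zeros((32, 4), dtype=np.float32)
print(fixed_zeros.dtype)    # float32
```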

Thank you for letting me know!

deruci commented 9 years ago

Thank you for the fast reply.