Closed by jsphon 7 years ago
  File "/home/jon/Dropbox/PycharmProjects/reinforcement_learning/rl/core/learner.py", line 43, in calculate_action_target
    return reward + self.gamma * next_state_action_values[next_state_action]
IndexError: index 1 is out of bounds for axis 0 with size 1
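It's because next_state_action_values has shape (1, 4): indexing it with a single integer selects along axis 0, which only has size 1, so index 1 is out of bounds.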
Why does it not work?
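A minimal NumPy sketch of what seems to be happening, in case it helps. Only the (1, 4) shape and the index 1 come from the traceback; the array values, reward and gamma below are made up for illustration, and this is not the actual learner.py code. A single integer index picks a row (axis 0), and there is only one row, so either index both axes or flatten the array first.

```python
import numpy as np

# Hypothetical values; only the (1, 4) shape and index 1 come from the traceback.
next_state_action_values = np.array([[0.1, 0.5, 0.2, 0.3]])
next_state_action = 1
reward = 1.0  # assumed for illustration
gamma = 0.9   # assumed for illustration

# Reproduce the failure: a single integer indexes axis 0, which has size 1.
try:
    next_state_action_values[next_state_action]
    raised = False
except IndexError as err:
    raised = True
    message = str(err)

# Option 1: index the single row explicitly, then the action column.
target_a = reward + gamma * next_state_action_values[0, next_state_action]

# Option 2: flatten to shape (4,) so that a single index selects the action.
flat_values = next_state_action_values.ravel()
target_b = reward + gamma * flat_values[next_state_action]
```

Which option fits depends on whether the leading axis is a batch dimension that should be kept elsewhere in the code; if the values come from a model that always returns a batch of one, dropping that axis right after the prediction may be the least intrusive change.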