jsphon / reinforcement_learning

Python Package For Reinforcement Learning

VectorizedStateMachineTargetArrayCalculator. #33

Closed jsphon closed 7 years ago

jsphon commented 7 years ago

It's not working: the ant world example fails.

Make some tests for it.

jsphon commented 7 years ago

The defect is here:

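        for action in range(num_actions):
            # The next state can have a different internal state from int_ext_state.
            next_int_ext_state = self.rl_system.model.apply_action(int_ext_state, action)
            # The reward function is passed the whole transition but only looks at the final state.
            reward = self.rl_system.reward_function(int_ext_state, action, next_int_ext_state)
            targets[action] = self.get_target(next_int_ext_state, action, reward)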

The problem is that applying the action changes the internal state in the next state, but the reward function only looks at the final state. So the reward function needs to be fixed.
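
To illustrate one way this kind of mismatch could arise, here is a minimal sketch. Everything in it is hypothetical and not taken from the package: the `IntExtState` class, the toy `apply_action` model, and both reward functions are stand-ins, and the real ant world reward may differ. It assumes an internal state that flips when the ant reaches the food, so a reward that inspects only the final state never pays out, while one that compares the internal state before and after the action does.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a combined internal/external state.
@dataclass(frozen=True)
class IntExtState:
    internal: int  # assumed: 0 = searching for food, 1 = carrying food home
    external: int  # assumed: position on a 1-D track, food at position 3


def apply_action(state, action):
    """Toy model: action 0 moves left, action 1 moves right.

    Reaching position 3 while searching flips the internal state to 1.
    """
    step = 1 if action == 1 else -1
    position = max(0, min(3, state.external + step))
    internal = state.internal
    if internal == 0 and position == 3:
        internal = 1
    return IntExtState(internal, position)


def final_state_reward(state, action, next_state):
    """Looks only at next_state: expects a *searching* ant on the food."""
    return 1.0 if next_state.internal == 0 and next_state.external == 3 else 0.0


def transition_reward(state, action, next_state):
    """Compares the internal state before and after the action."""
    return 1.0 if state.internal == 0 and next_state.internal == 1 else 0.0


start = IntExtState(internal=0, external=2)

# Same loop shape as the snippet above: one reward per action.
rewards_final = [
    final_state_reward(start, a, apply_action(start, a)) for a in range(2)
]
rewards_transition = [
    transition_reward(start, a, apply_action(start, a)) for a in range(2)
]

print(rewards_final)       # the internal state has already flipped, so no reward
print(rewards_transition)  # the transition-aware version rewards action 1
```

In this toy setup the final-state reward is zero for every action, because by the time it is evaluated the internal state has already changed. A test along these lines, written against the real reward function, would be one way to pin the defect down.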