Open terkkila opened 10 years ago
How the reward is propagated back to older states in the trajectory.
How the reward is propagated back to older states in the trajectory.