LyWangPX / Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions of Reinforcement Learning, An Introduction
MIT License

Ex 4.1 #78

Open StoyanVenDimitrov opened 3 years ago

StoyanVenDimitrov commented 3 years ago

Hi,

How do you arrive at the value of state 11 being -14?

Kin-Zhang commented 3 years ago

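Example 4.1 is an undiscounted episodic task: the reward is -1 on every transition until the terminal state is reached.

The -14 is not a reward. It is the state value v_π(11), the expected return from state 11 under the equiprobable random policy, which you can read off the k = ∞ grid in Figure 4.1. You can confirm it with the Bellman equation: from state 11, up leads to state 7 (value -20), left leads to state 10 (value -18), right runs off the grid and stays in state 11 (value -14), and down leads to the terminal state (value 0), so v_π(11) = -1 + (1/4)(-20 - 18 - 14 + 0) = -14.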
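If you want to check the number yourself, here is a minimal sketch of iterative policy evaluation for the 4x4 gridworld, assuming the setup of Example 4.1 (terminal states in the top-left and bottom-right corners, equiprobable random policy, reward -1 per step, no discounting). The function name `evaluate_random_policy` and the `theta` tolerance are my own choices for illustration, not something taken from the repo's solutions.

```python
def evaluate_random_policy(n=4, theta=1e-10):
    """Iterative policy evaluation for the equiprobable random policy
    on an n x n gridworld with terminal corners (assumed Example 4.1 setup)."""
    terminals = {(0, 0), (n - 1, n - 1)}
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    v = [[0.0] * n for _ in range(n)]
    while True:
        new_v = [[0.0] * n for _ in range(n)]
        delta = 0.0
        for r in range(n):
            for c in range(n):
                if (r, c) in terminals:
                    continue  # terminal states keep value 0
                total = 0.0
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    # A move off the grid leaves the state unchanged.
                    if not (0 <= nr < n and 0 <= nc < n):
                        nr, nc = r, c
                    # Each action has probability 1/4; reward is -1, gamma is 1.
                    total += 0.25 * (-1.0 + v[nr][nc])
                new_v[r][c] = total
                delta = max(delta, abs(total - v[r][c]))
        v = new_v
        if delta < theta:
            return v


values = evaluate_random_policy()
# With row-major numbering, state s sits at row s // 4, column s % 4,
# so state 11 is at row 2, column 3.
v11 = values[11 // 4][11 % 4]
print(round(v11, 3))
```

The printed value should come out at about -14, matching the figure. For the exercise itself that gives q_π(7, down) = -1 + v_π(11) = -15, while q_π(11, down) = -1 because that move goes straight to the terminal state.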