Closed franzoni315 closed 4 years ago
Well, you are right. But shouldn't it be s' belongs to S under the first summation? I think that would be more precise.
Uhmm, actually, under the first summation I would pick s' belongs to S+, since the next state might be the terminal state, and I would pick s belongs to S for the current state, for the reason given in my previous post.
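For reference, here is how I would write the condition with those index sets. This is only a sketch, and it assumes the equation under discussion is the normalization of the dynamics function p(s', r | s, a), i.e. the one with a sum over next states and a sum over rewards. The symbols R and A(s) follow the textbook's usual notation and are not taken from this thread:

```latex
\sum_{s' \in \mathcal{S}^{+}} \sum_{r \in \mathcal{R}} p(s', r \mid s, a) = 1,
\qquad \text{for all } s \in \mathcal{S},\ a \in \mathcal{A}(s).
```

The next state s' ranges over S+ because an episode can end on this step, while the current state s ranges over S only, because no action is taken from the terminal state.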
I just noticed there is some confusion about the definition of S+. The textbook says:
In episodic tasks we sometimes need to distinguish the set of all nonterminal states, denoted S, from the set of all states plus the terminal state, denoted S+.
However, in the solution, S+ = {non-terminal states}. The textbook actually calls this set S, while S+ is S plus the terminal state, i.e., all possible states.
Does it make sense?
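To make the distinction concrete, here is a minimal sketch in Python. Everything in it is made up for illustration (the two states "A" and "B", the actions, the rewards, and the probabilities are hypothetical, not from the book or the solution). It only shows that the probabilities sum to 1 when the next state ranges over S+, and can fall short when it ranges over S alone:

```python
# Hypothetical two-state episodic task, used only to illustrate the notation.
S = {"A", "B"}                    # nonterminal states (the textbook's S)
S_plus = S | {"terminal"}         # S plus the terminal state (the textbook's S+)
actions = {"A": ["go"], "B": ["go", "stop"]}   # no actions for the terminal state

# p[(s, a)] maps (next_state, reward) -> probability. Made-up numbers.
p = {
    ("A", "go"): {("B", 0.0): 0.5, ("terminal", 1.0): 0.5},
    ("B", "go"): {("A", 0.0): 1.0},
    ("B", "stop"): {("terminal", 0.0): 1.0},
}

def total_prob(s, a, next_states):
    """Sum p(s', r | s, a) over all rewards r and over s' in next_states."""
    return sum(prob for (s_next, _r), prob in p[(s, a)].items()
               if s_next in next_states)

# The current state s ranges over S only; the next state ranges over S+ or S.
sums_over_S_plus = {(s, a): total_prob(s, a, S_plus) for s in S for a in actions[s]}
sums_over_S = {(s, a): total_prob(s, a, S) for s in S for a in actions[s]}

for key in sorted(sums_over_S_plus):
    print(key, "over S+:", sums_over_S_plus[key], "over S:", sums_over_S[key])
```

In this toy example, any state-action pair that can end the episode loses probability mass when s' is restricted to S, which is why the first summation needs S+.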
To be more precise in the solution, I think it should be s belongs to S (not S+), since the dynamics do not make much sense for the terminal state: it has no possible next states, or even actions.