LyWangPX / Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions of Reinforcement Learning, An Introduction
MIT License

Error in solution for 12.2 #54

Closed: gakshaygupta closed this issue 4 years ago

gakshaygupta commented 4 years ago

The nth term of a geometric progression is a*r^(n-1), but you have mistakenly used a*r^n. Here n = T(lambda).

LyWangPX commented 4 years ago

Well, I do not think n = T(lambda). I realized it is not n, and because of this ambiguity I did not say it is n in the original answer. I want the reader to check their result against this reference.

However, I did change the answer so that it at least handles this problem more generally.

gakshaygupta commented 4 years ago

Well, don't you think that introducing time into your formulation is a bit misleading, since T(lambda) is a "half-life constant"? It just specifies the number of steps required to reduce the decay factor to half.
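To make the off-by-one question concrete, here is a minimal sketch. It assumes the usual lambda-return weighting w_n = (1 - lambda) * lambda^(n-1) for n = 1, 2, ..., and reads the half-life as the number of steps k after the first term at which the weight has halved, so lambda^k = 1/2. The function names `weight` and `halving_steps` are hypothetical and are not taken from the repository's solution.

```python
import math


def weight(lam: float, n: float) -> float:
    """Weight of the n-th term, assuming w_n = (1 - lam) * lam**(n - 1), n >= 1."""
    return (1.0 - lam) * lam ** (n - 1)


def halving_steps(lam: float) -> float:
    """Steps k (not necessarily an integer) such that lam**k == 1/2, for 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(lam)


lam = 0.9
k = halving_steps(lam)

# Counting steps after the first term: the weight at index 1 + k is half of w_1.
ratio_steps = weight(lam, 1 + k) / weight(lam, 1)

# Treating k itself as the term index n (the a*r^(n-1) reading) gives a different ratio.
ratio_index = weight(lam, k) / weight(lam, 1)

print(k, ratio_steps, ratio_index)
```

Under these assumptions the two readings differ by exactly one step: if the half-life is a count of steps, the relation is lambda^k = 1/2, and if it is the index of a term in the progression, the relation is lambda^(n-1) = 1/2.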

LyWangPX commented 4 years ago

Did you check my updated definition?

gakshaygupta commented 4 years ago

> did you check my updated definition?

Oh, sorry, I didn't see the note at the bottom; that's why I got confused.