Open elfelround opened 4 years ago
Thanks for pointing out this erratum. You are right, I think I missed counting the final step. At t=7, an action is taken in order to reach the terminal state T.
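So for episode 1, t goes from 0 to 8, and recall that the sequence is denoted by S0, A0, R1, S1, A1, R2, ..., where R(t+1) is the reward received after taking action At in state St.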
Also, we know the following immediate rewards: R1=0, R2=0, ..., R6=0, R7=0, R8=1
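So now let's calculate the returns for the first episode. With a discount factor γ, the only nonzero reward is R8=1, so the return from time step t is G_t = γ^(7-t) for t = 0, ..., 7 (for example, G_7 = 1 and G_0 = γ^7).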
Similarly, for episode 2, t goes from 0 to 10, and R1=R2=...=R9=0 while R10=-1, so with a discount factor γ the returns are G_t = -γ^(9-t) for t = 0, ..., 9.
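To make the counting easy to double-check, here is a minimal Python sketch that computes the discounted returns for the two episodes from the rewards listed above. The helper name `discounted_returns` and the value `gamma = 0.9` are assumptions made only for illustration; they are not taken from the book's code.

```python
def discounted_returns(rewards, gamma):
    """Compute G_t for every non-terminal time step of one episode.

    rewards[k] holds R_{k+1}, the reward received after the action at t=k,
    so the result has one entry per action taken (G_0 ... G_{T-1}).
    """
    returns = [0.0] * len(rewards)
    g = 0.0
    # Work backwards using G_t = R_{t+1} + gamma * G_{t+1}
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


gamma = 0.9  # assumed discount factor, for illustration only

# Episode 1: R1..R7 = 0, R8 = 1 (terminal state reached at t=8)
episode1_rewards = [0] * 7 + [1]
# Episode 2: R1..R9 = 0, R10 = -1 (terminal state reached at t=10)
episode2_rewards = [0] * 9 + [-1]

g1 = discounted_returns(episode1_rewards, gamma)
g2 = discounted_returns(episode2_rewards, gamma)

for t, g in enumerate(g1):
    print(f"episode 1: G_{t} = {g:.4f}")
for t, g in enumerate(g2):
    print(f"episode 2: G_{t} = {g:.4f}")
```

Since each episode has a single nonzero reward at the end, the printed values should match γ^(7-t) for episode 1 and -γ^(9-t) for episode 2, which is a quick way to confirm that the final step is counted.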
@vmirly I posted this erratum via Packt, but it was annoying as heck to explain without an image. They just asked, "Can you explain further?" and I thought, forget it. Having a corrected answer will help my understanding of this part. I'll read it alongside the book when I have time and let you know how it follows :) Also, the RL chapter is still great, but it feels rushed in comparison with the rest of the book. Maybe a bit more love for it in the 4th ed? xx
Oh yeah, for sure. The rewrites for TF 2.0 took much longer than expected, and we were both very busy in the fall (due to my teaching responsibilities and Vahid starting a new position). It definitely could and should be smoothed out in a potential next edition. Thanks for your feedback!
book errata p 682