Exercise 3.22

hhoppe commented 2 years ago

Within the summations, should the factor be $\gamma^{2k}$ because there are two steps to cycle around the loop of two states?

vojtamolda commented 2 years ago

Hi! Thanks a lot for submitting the issue. I think you're right. The term in brackets covers the whole cycle, i.e. from the decision point back to the decision point, and completing that cycle takes two steps. So the exponent on $\gamma$ should advance by two per cycle, which gives $\gamma^{2k}$ with $k$ counting cycles.
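A quick numerical sanity check of that argument, as a minimal sketch: it assumes a two-state loop that pays a reward `r1` on the first step and `r2` on the second step of every cycle. The function names and the reward values in the usage note are hypothetical placeholders, not taken from the solution. Summing per cycle with $\gamma^{2k}$ should agree with discounting every time step individually.

```python
def cycle_return(r1, r2, gamma, cycles=1000):
    """Truncated return, summed cycle by cycle with a gamma**(2k) factor."""
    total = 0.0
    for k in range(cycles):
        # Cycle k starts at time step 2k, so it is discounted by gamma**(2k).
        # The bracketed term (r1 + gamma * r2) is the return of one full cycle.
        total += gamma ** (2 * k) * (r1 + gamma * r2)
    return total


def stepwise_return(r1, r2, gamma, steps=2000):
    """Truncated return, summed one time step at a time."""
    total = 0.0
    for t in range(steps):
        # Rewards alternate r1, r2, r1, r2, ... around the loop.
        reward = r1 if t % 2 == 0 else r2
        total += gamma ** t * reward
    return total


def closed_form(r1, r2, gamma):
    """Closed form of the cycle sum, valid for |gamma| < 1."""
    return (r1 + gamma * r2) / (1 - gamma ** 2)
```

For example, `cycle_return(1.0, 0.0, 0.9)`, `stepwise_return(1.0, 0.0, 0.9)` and `closed_form(1.0, 0.0, 0.9)` should all agree up to truncation error, whereas using $\gamma^{k}$ in the cycle sum would not match the step-by-step total.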

Here's a corrected solution. Please @hhoppe, let me know if you got the same result. The correct result looks (luckily) very similar.

And here's how I worked out the sum of only the even terms in a geometric series. I hope I haven't embarrassed myself by making a silly mistake somewhere...
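One way to sum only the even terms, written out as a sketch that may differ from the steps in the original working; it assumes $|\gamma| < 1$ so that every series converges. Split the full geometric series into its even part $E$ and odd part $O$, and note that the odd part is just $\gamma$ times the even part:

$$
\begin{aligned}
\sum_{k=0}^{\infty} \gamma^{k} &= \frac{1}{1-\gamma} = E + O,
\qquad E = \sum_{k=0}^{\infty} \gamma^{2k},
\qquad O = \sum_{k=0}^{\infty} \gamma^{2k+1} = \gamma E, \\
E\,(1+\gamma) &= \frac{1}{1-\gamma}
\quad\Longrightarrow\quad
E = \frac{1}{(1-\gamma)(1+\gamma)} = \frac{1}{1-\gamma^{2}}.
\end{aligned}
$$

The same result follows directly by treating $\gamma^{2}$ as the common ratio: $\sum_{k=0}^{\infty} (\gamma^{2})^{k} = \frac{1}{1-\gamma^{2}}$.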

hhoppe commented 2 years ago

Yes, that matches what I get; thanks so much!

vojtamolda commented 2 years ago

Fixed!

vojtamolda / reinforcement-learning-an-introduction

Exercise 3.22 #7