Ch 10 Ex. 10.6 - Githubissues

LyWangPX / Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions of Reinforcement Learning, An Introduction

MIT License


Ch 10 Ex. 10.6 #64

Closed KimMatt closed 4 years ago

KimMatt commented 4 years ago

Hi, I'm currently working through the exercises for this book as well: https://github.com/KimMatt/RL_Projects

In your solution to 10.6, how did you get from

$E[R_{t+1}|S_0=s] - r(\pi)$

to

$\frac{(-1)^t}{2}$?

LyWangPX commented 4 years ago

Hi there, I think expanding the reward series and subtracting the average reward from each term gives this result. By the way, the book updated 10.6 and 10.7 and swapped them. Please make sure you are using the newest version.

Cheers

Edit: I will close the issue for now. If you find it unclear, comment below and I will re-open it for you.
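For anyone following along, here is a minimal sketch of that expansion. It assumes the exercise in question is the one where the rewards alternate deterministically +1, 0, +1, 0, ..., so that the average reward $r(\pi)$ is 1/2, and that the starting state sees the +1 first. The helper name `centered_rewards` is hypothetical and not taken from the solutions repo.

```python
from fractions import Fraction


def centered_rewards(first_reward, horizon):
    """Return E[R_{t+1} | S_0 = s] - r(pi) for t = 0 .. horizon - 1.

    Assumption: rewards alternate deterministically between 1 and 0,
    beginning with `first_reward`, so the average reward r(pi) is 1/2.
    """
    avg = Fraction(1, 2)
    # (first_reward + t) % 2 flips between 1 and 0 on every step.
    return [Fraction((first_reward + t) % 2) - avg for t in range(horizon)]


# Start in the state whose first reward is +1 and compare term by term
# with the closed form (-1)^t / 2.
terms = centered_rewards(1, 6)
closed_form = [Fraction((-1) ** t, 2) for t in range(6)]
print(terms == closed_form)  # prints True
```

Under the same assumption, starting from the state whose first reward is 0 flips every sign, so `centered_rewards(0, 6)` matches $\frac{(-1)^{t+1}}{2}$ instead.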

KimMatt commented 4 years ago

Ok thanks. Gonna have another crack at it