Exercise 3.18

LyWangPX / Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions of Reinforcement Learning, An Introduction

MIT License

Open: ghost opened this issue 2 years ago

ghost commented 2 years ago

I think E_π[ q_π(S_t, A_t) | S_t = s ] is better.
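
For reference, here is a short sketch of how the expression above expands, assuming the book's standard notation, where π(a|s) is the probability that the policy selects action a in state s. Conditioning on S_t = s fixes the state, so q_π(S_t, A_t) and q_π(s, A_t) take the same value inside the expectation, and only A_t remains random:

```latex
\begin{align*}
v_\pi(s) &= \mathbb{E}_\pi\left[ q_\pi(S_t, A_t) \mid S_t = s \right] \\
         &= \sum_{a} \Pr\{A_t = a \mid S_t = s\}\, q_\pi(s, a) \\
         &= \sum_{a} \pi(a \mid s)\, q_\pi(s, a)
\end{align*}
```

Under this reading the two ways of writing the expectation are equivalent, so the choice between them comes down to notation rather than correctness.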