LyWangPX / Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions of Reinforcement Learning, An Introduction
MIT License
2.04k stars, 465 forks

Exercise 3.18 #94

Open ghost opened 2 years ago

ghost commented 2 years ago

I think E_π[q_π(S_t, A_t) | S_t = s] is better.
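For anyone checking this numerically: for a single state s with finitely many actions, the expectation E_π[q_π(S_t, A_t) | S_t = s] expands to the sum over actions a of π(a|s) q_π(s, a). Below is a minimal Python sketch of that sum. The function name `state_value` and the probabilities and action values are made up for illustration and are not taken from the book or from this repo.

```python
def state_value(policy_probs, action_values):
    """Return sum_a pi(a|s) * q_pi(s, a) for one fixed state s.

    policy_probs[i]  : pi(a_i | s), assumed to sum to 1 (hypothetical input)
    action_values[i] : q_pi(s, a_i) (hypothetical input)
    """
    if len(policy_probs) != len(action_values):
        raise ValueError("need one probability per action value")
    # Weight each action value by the probability of taking that action.
    return sum(p * q for p, q in zip(policy_probs, action_values))


# Made-up example: three actions available in state s.
pi_s = [0.5, 0.25, 0.25]   # pi(a|s)
q_s = [1.0, 2.0, 4.0]      # q_pi(s, a)
v_s = state_value(pi_s, q_s)
print(v_s)  # 0.5*1.0 + 0.25*2.0 + 0.25*4.0 = 2.0
```

With a deterministic policy (all probability on one action), the sum reduces to the q-value of that action.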