LyWangPX / Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions

Solutions of Reinforcement Learning, An Introduction
MIT License

Ex 4.7-A #67

Closed Avalpreet closed 4 years ago

Avalpreet commented 4 years ago

Line 151: pi[(i, j)] = 0. Why is every pi[(i, j)] initialized to 0?

Shouldn't each pi[(i, j)] instead be mapped to an action a chosen at random from the set A = {-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5}?
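For reference, here is a minimal sketch of the two initializations being compared. It is not the repository's script: MAX_CARS, MAX_MOVE, feasible_actions, and init_policy are hypothetical names, and the sign convention (a positive action moves cars from the first location to the second) is an assumption. Policy iteration on a finite MDP converges to an optimal policy from any valid starting policy, and the book's own example starts from the policy that never moves any cars, so all zeros is a legitimate choice. A random start would also work, provided each sampled action is feasible for its state.

```python
import random

MAX_CARS = 20  # assumed capacity of each location, as in the book's car rental example
MAX_MOVE = 5   # assumed limit on cars moved overnight

def feasible_actions(i, j):
    """Actions allowed in state (i, j).

    Assumed convention: a > 0 moves a cars from location 1 to location 2,
    a < 0 moves |a| cars the other way. We cannot move more cars than a
    location currently holds.
    """
    return [a for a in range(-MAX_MOVE, MAX_MOVE + 1) if -j <= a <= i]

def init_policy(randomize=False, seed=0):
    """Build a deterministic policy as a dict mapping (i, j) -> action."""
    rng = random.Random(seed)
    pi = {}
    for i in range(MAX_CARS + 1):
        for j in range(MAX_CARS + 1):
            if randomize:
                pi[(i, j)] = rng.choice(feasible_actions(i, j))
            else:
                pi[(i, j)] = 0  # "never move any cars"
    return pi

zero_pi = init_policy()
random_pi = init_policy(randomize=True)
```

Either dictionary can be handed to the same evaluation and improvement loop. The starting point affects how many improvement sweeps are needed, not whether the loop reaches an optimal policy.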

LyWangPX commented 4 years ago

You are welcome to bring your own solution, since this is a... terrible early script that I wrote haphazardly, at best. I can rewrite it, but I think you could do that too.

If your proposal makes the entire script work, I can make the update.

Avalpreet commented 4 years ago

Thanks for the response.