Yelinz opened 6 months ago · Status: Open
Fix the dynamics of the program so that they match the dynamics described in the exercise.
Mentioned in: https://github.com/LyWangPX/Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions/issues/85
New output, reflecting the solution referenced above (previously, running the script returned incorrect state values):
----------------------
|   0 | -14 | -20 | -22 |
----------------------
| -14 | -18 | -20 | -20 |
----------------------
| -20 | -20 | -18 | -14 |
----------------------
| -22 | -20 | -14 |   0 |
----------------------
|     | -20 |     |     |
----------------------

Accurate State Values List:
State 1: -14.089474627936156
State 2: -20.11529496144879
State 3: -22.116550116550094
State 4: -14.131432669894195
State 5: -18.153128922359677
State 6: -20.139860139860122
State 7: -20.117805271651406
State 8: -20.241169087322913
State 9: -20.251748251748232
State 10: -18.173211403980616
State 11: -14.097005558544007
State 12: -22.340326340326317
State 13: -20.439483593329726
State 14: -14.204231665770113
State 15: 0
State 16: -20.961628115474248
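For context, below is a minimal sketch of iterative policy evaluation under the dynamics described in Exercise 4.2 of Sutton & Barto: the gridworld of Example 4.1 with a new state added below state 13, whose left/up/right/down actions lead to states 12/13/14/itself, and with state 13's down action redirected to the new state. This is not the repository's script; the grid layout, the helper names, and the state numbering (1-14 plus a new state 15, with the terminals folded into index 0) are assumptions for illustration and differ from the indexing used in the output above.

import numpy as np

STATES = range(1, 16)  # non-terminal states 1..14 plus the new state 15
ACTIONS = ["up", "down", "left", "right"]

# (row, col) layout of the original 4x4 grid; 0 marks the terminal corners
GRID = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 0]]
POS = {GRID[r][c]: (r, c) for r in range(4) for c in range(4) if GRID[r][c] != 0}

def next_state(s, a):
    """Deterministic successor under the exercise's modified dynamics."""
    if s == 15:  # new state below state 13
        return {"left": 12, "up": 13, "right": 14, "down": 15}[a]
    if s == 13 and a == "down":  # changed dynamics: down from 13 now enters 15
        return 15
    r, c = POS[s]
    dr, dc = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}[a]
    nr, nc = r + dr, c + dc
    if 0 <= nr < 4 and 0 <= nc < 4:
        return GRID[nr][nc]
    return s  # moves off the grid leave the state unchanged

def policy_evaluation(theta=1e-10):
    """Evaluate the equiprobable random policy (reward -1 per step, gamma = 1)."""
    V = np.zeros(16)  # index 0 is the terminal state, fixed at 0
    while True:
        delta = 0.0
        for s in STATES:
            v_new = sum(0.25 * (-1 + V[next_state(s, a)]) for a in ACTIONS)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new  # in-place (Gauss-Seidel style) sweep
        if delta < theta:
            break
    return V

if __name__ == "__main__":
    V = policy_evaluation()
    for s in STATES:
        print(f"State {s}: {V[s]:.2f}")

With a tight convergence threshold this sketch approaches the textbook values for states 1-14 and roughly -20 for the new state, which is the answer the linked issue expects; the exact decimals in the output above depend on the script's own stopping criterion and state indexing.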