The Q table I got from both FrozenLake consists of 0s.

TristanBester / gym_algos

Solutions + Write-ups to OpenAI Gym environments

MIT License


The Q table I got from both FrozenLake consists of 0s. #1

Closed caojilin closed 3 years ago

caojilin commented 3 years ago

After running FrozenLake_v0.py, the agent didn't learn anything. I'm wondering which part is wrong.

TristanBester commented 3 years ago

Thanks for bringing up the issue. I just looked into it and have resolved it.

The issue was in the SARSA implementation. In the training loop, the reward value was overwritten after calling env.step(), but the new value did not persist. As a result, the reward used in the update was always zero, so every state-action pair was assigned a value of zero.
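For anyone hitting the same symptom, here is a minimal sketch of a tabular SARSA loop in which the reward returned by step() is the same value that reaches the update. It is not the repository's code: `TinyChain` is a hypothetical stand-in for FrozenLake (so the example runs without Gym), and the function names and hyperparameters are assumptions chosen for illustration.

```python
import random


class TinyChain:
    """Hypothetical stand-in for FrozenLake: states 0..4, goal at state 4."""
    n_states = 5
    n_actions = 2  # 0 = left, 1 = right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0  # sparse reward, only at the goal
        return self.state, reward, done


def epsilon_greedy(Q, state, epsilon, rng):
    if rng.random() < epsilon:
        return rng.randrange(len(Q[state]))
    best = max(Q[state])
    # Break ties randomly so an all-zero table does not always pick action 0.
    return rng.choice([a for a, q in enumerate(Q[state]) if q == best])


def sarsa(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1,
          max_steps=100, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        action = epsilon_greedy(Q, state, epsilon, rng)
        for _ in range(max_steps):
            # Keep the reward from step() in the variable the update reads.
            # Any adjustment to it should be assigned back to `reward`.
            next_state, reward, done = env.step(action)
            next_action = epsilon_greedy(Q, next_state, epsilon, rng)
            # SARSA target: bootstrap from the action actually chosen next.
            bootstrap = 0.0 if done else gamma * Q[next_state][next_action]
            Q[state][action] += alpha * (reward + bootstrap - Q[state][action])
            state, action = next_state, next_action
            if done:
                break
    return Q


Q = sarsa(TinyChain())
```

A quick sanity check after training is to confirm that at least one entry of Q is nonzero. If the whole table is still zero, the reward is probably not reaching the update line.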