I'm currently reading the 3rd chapter, where we implement a way to solve the FrozenLake environment.
As it is written: "We give +1 point as a reward to the agent if it correctly walks on the frozen lake and 0 points if it falls into the hole"
So my question is: why would the agent want to reach the exit of the lake if the environment does not give it a negative reward for each step taken on the lake?
I thought the agent would want to maximize its total reward by walking indefinitely on frozen tiles, since each safe step gives it +1 reward.
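To make my reasoning concrete, here is a minimal sketch of the total reward as I understand it from the quote. The function name and the step counts are my own illustration, not from the book, and I am assuming no discounting and no limit on the number of steps:

```python
def undiscounted_return(num_safe_steps, ends_in_hole=False):
    """Total reward under my reading of the quote:
    +1 for every safe step on the ice, 0 for stepping into a hole."""
    rewards = [1] * num_safe_steps  # one +1 per safe step (my assumption)
    if ends_in_hole:
        rewards.append(0)  # falling into the hole adds nothing
    return sum(rewards)


# Hypothetical step counts, chosen only for illustration.
short_walk = undiscounted_return(6)    # a short walk across the lake
long_walk = undiscounted_return(1000)  # wandering on the ice for a long time
print(short_walk, long_walk)
```

Under these assumptions the longer walk always collects more reward, which is why I don't see what pushes the agent toward the exit.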
Thanks in advance.