Farama-Foundation / Minigrid

Simple and easily configurable grid world environments for reinforcement learning
https://minigrid.farama.org/

Fetch does not seem to produce the negative reward #170

Closed manuel-delverme closed 2 years ago

manuel-delverme commented 2 years ago

This environment has multiple objects of assorted types and colors. The agent receives a textual string as part of its observation telling it which object to pick up. Picking up the wrong object produces a negative reward.

maximecb commented 2 years ago

Good catch. It produces a zero reward. There used to be negative rewards, but I found that they often made it hard to converge: the agent tends to learn that doing nothing is better than acting, and it gets trapped in a passive local maximum.
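The convergence problem can be illustrated with a little arithmetic. The sketch below is not Minigrid code. It compares the expected return of an untrained agent that picks up an object at random with that of an agent that never picks anything up. The function names, the success reward of 1.0, the penalty of -1.0, and the four-object room are all assumptions chosen for illustration.

```python
# Illustrative sketch only: all names and numbers here are assumptions,
# not values taken from the Minigrid source.

# An agent that never picks anything up just lets the episode time out.
PASSIVE_RETURN = 0.0


def expected_return(p_correct: float, success_reward: float, wrong_penalty: float) -> float:
    """Expected return of picking up one object, which ends the episode.

    p_correct is the probability that the picked object is the target.
    """
    return p_correct * success_reward + (1.0 - p_correct) * wrong_penalty


def prefers_passivity(p_correct: float, success_reward: float, wrong_penalty: float) -> bool:
    """True when doing nothing has a higher expected return than picking."""
    return expected_return(p_correct, success_reward, wrong_penalty) < PASSIVE_RETURN


# Early in training the agent picks roughly at random among 4 objects.
p_random = 1.0 / 4

# Zero reward for a wrong pick: acting is still better than doing nothing.
zero_penalty = expected_return(p_random, success_reward=1.0, wrong_penalty=0.0)

# Negative reward for a wrong pick: acting now looks worse than doing nothing.
negative_penalty = expected_return(p_random, success_reward=1.0, wrong_penalty=-1.0)

print(zero_penalty, negative_penalty)
```

Under these assumed numbers, a random pick has a positive expected return when wrong picks give zero reward and a negative one when they are penalized. In the penalized case an early policy is pushed toward never picking anything up, which matches the passive behavior described above.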