openai / procgen

Procgen Benchmark: Procedurally-Generated Game-Like Gym-Environments
https://openai.com/blog/procgen-benchmark/
MIT License
1.01k stars 209 forks source link

Miner incorrectly wins game if the agent spawns on the exit due to incorrect diamond count initialization #67

Open jordan-schneider opened 3 years ago

jordan-schneider commented 3 years ago

I don't know if this counts as something that won't be fixed for backwards compatability, but if the agent starts an episode on the exit (as in seed 1926574277), then

  1. miner.cpp initializes diamonds_remaining as 0
  2. Miner::game_step calls BasicAbstractGame::game_step() first before updating diamonds_remaining
  3. BasicAbstractGame::game_step() calls for collision handling
  4. Miner::handle_agent_collision checks the (unchanged) diamonds_remaining value, sees it's zero, and immediately exits the episode.
jordan-schneider commented 3 years ago

This is easily fixed by initializing diamonds_remaining to a sentinel value like -1. If this is something y'all are interested in fixing I'd be happy to write the PR.

christopherhesse commented 2 years ago

Thanks for investigating and letting us know, but unfortunately we don't want to change the behavior of the published environments. I think there has been a couple more reports like this, so if you wanted to make a PR to start a KNOWN_ISSUES.md file with this issue, linked from the README, I'd be happy to approve that.

christopherhesse commented 2 years ago

Could also note the bug in a code comment in miner.cpp so that people would be more likely to see it.

christopherhesse commented 2 years ago

Here is the other issue I found: https://github.com/openai/procgen/issues/59