google-deepmind / hanabi-learning-environment

hanabi_learning_environment is a research platform for Hanabi experiments.
Apache License 2.0

Question: Why no observation stacking? #26

Closed by carlbalmer 4 years ago

carlbalmer commented 4 years ago

In the paper, you state that you did not use any observation stacking beyond the player's previous action.

What is the reasoning for this? Did you try configurations with stacking and find that they did not perform well?

Another possibility would have been to include the other players' actions taken since the current player's last action. Was something like this considered?

mgbellemare commented 4 years ago

Yes, observation stacking didn't seem to provide any immediate benefit. It's possible that earlier observations add little beyond the most recent one.
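
For readers unfamiliar with the term, here is a minimal sketch of what observation stacking usually means: the agent's input is the concatenation of the last few observation vectors rather than only the latest one. The class name `SimpleObservationStacker` and its methods are hypothetical and written for illustration. This is not the code used in the paper, and it does not rely on any function from hanabi_learning_environment. Observations are assumed to be fixed-length lists of integers.

```python
from collections import deque
from typing import Deque, List


class SimpleObservationStacker:
    """Hypothetical helper: keeps the last `history_size` observations."""

    def __init__(self, history_size: int, observation_size: int) -> None:
        if history_size < 1:
            raise ValueError("history_size must be at least 1")
        self.history_size = history_size
        self.observation_size = observation_size
        # A bounded deque drops the oldest observation automatically.
        self._frames: Deque[List[int]] = deque(maxlen=history_size)
        self.reset()

    def reset(self) -> None:
        """Fill the history with all-zero observations at episode start."""
        self._frames.clear()
        for _ in range(self.history_size):
            self._frames.append([0] * self.observation_size)

    def add(self, observation: List[int]) -> None:
        """Append the newest observation, evicting the oldest one."""
        if len(observation) != self.observation_size:
            raise ValueError("unexpected observation length")
        self._frames.append(list(observation))

    def stacked(self) -> List[int]:
        """Return the history concatenated oldest-first into one vector."""
        result: List[int] = []
        for frame in self._frames:
            result.extend(frame)
        return result


stacker = SimpleObservationStacker(history_size=2, observation_size=3)
stacker.add([1, 0, 1])
print(stacker.stacked())
```

With `history_size=1` the stacked vector is just the most recent observation, which corresponds to the no-stacking setup discussed in this thread.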