carlbalmer closed this issue 4 years ago
In the paper, you state that you did not use any observation stacking apart from the player's previous action.
What is the reasoning for this? Did you try configurations with stacking and find that they did not perform well?
Another possibility would have been to include the other players' actions taken since the current player's last action. Was something like this considered?
Yes, observation stacking didn't seem to provide any immediate benefit. It's possible that earlier observations add little information beyond what the most recent one already contains.
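For anyone reading along who wants to compare the two setups, here is a minimal sketch of what they could look like. It is not the code from the paper: the names `ObservationStacker` and `with_previous_action` are made up for illustration, and it assumes observations are flat numeric vectors and actions are integer indices.

```python
from collections import deque

import numpy as np


class ObservationStacker:
    """Hypothetical helper: keeps the last k observations and concatenates them."""

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)  # the oldest frame is dropped automatically

    def reset(self, first_obs):
        # At episode start there is no history, so repeat the first observation.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(np.asarray(first_obs, dtype=np.float32))
        return self.stacked()

    def step(self, obs):
        # Push the newest observation and return the stacked vector.
        self.frames.append(np.asarray(obs, dtype=np.float32))
        return self.stacked()

    def stacked(self):
        return np.concatenate(list(self.frames))


def with_previous_action(obs, prev_action, num_actions):
    """No stacking: the current observation plus a one-hot of the previous action."""
    one_hot = np.zeros(num_actions, dtype=np.float32)
    if prev_action is not None:  # None means no action has been taken yet
        one_hot[prev_action] = 1.0
    return np.concatenate([np.asarray(obs, dtype=np.float32), one_hot])
```

With `k` stacked frames the input grows to `k` times the observation size, whereas the previous-action variant only adds `num_actions` entries, which is one practical reason to prefer it when stacking shows no benefit.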