google-deepmind / meltingpot

A suite of test scenarios for multi-agent reinforcement learning.
Apache License 2.0

Proposal: do something about redundantly large observations #152

Closed (dimonenka closed this issue 1 year ago)

dimonenka commented 1 year ago

The observations are large images because spriteSize=8. However, most of this information is redundant. In the rllib example, a convolutional layer is applied with stride 8, reducing each sprite's 8x8=64 pixels to a single position, which means the observations are 64 times larger than needed. This might be fine for on-policy methods, but for methods that use replay buffers it turns a memory footprint of several gigabytes into several hundred gigabytes. I had to change spriteSize=8 to spriteSize=1 in the config files of specific environments, which works but is not the most elegant (or intended) solution.
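To make the factor of 64 concrete, here is a rough back-of-the-envelope sketch. The 11x11 cell window, the 3 colour channels, the 1 byte per value and the one-million-transition buffer are assumptions chosen for illustration, not values taken from the repo, and `replay_buffer_bytes` is a hypothetical helper.

```python
def replay_buffer_bytes(num_transitions, cells_h, cells_w, sprite_size,
                        channels=3, bytes_per_value=1):
    """Bytes needed to store one image observation per transition.

    All arguments are illustrative assumptions, not Melting Pot defaults.
    """
    height = cells_h * sprite_size
    width = cells_w * sprite_size
    return num_transitions * height * width * channels * bytes_per_value


# Same hypothetical buffer, stored at spriteSize=8 and at spriteSize=1.
full = replay_buffer_bytes(1_000_000, 11, 11, sprite_size=8)
small = replay_buffer_bytes(1_000_000, 11, 11, sprite_size=1)

print(f"spriteSize=8: {full / 1e9:.2f} GB")
print(f"spriteSize=1: {small / 1e9:.2f} GB")
print(f"ratio: {full // small}x")
```

The ratio depends only on the sprite size squared, so it stays at 64 whatever window size or buffer length is assumed.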

As far as I understand, spriteSize=8 is only useful for rendering. So a good solution to this issue could be to use spriteSize=1 everywhere but transform small images into bigger images as a subroutine during rendering. A worse but maybe easier solution would be to allow users to change spriteSize in the config (and also explain this parameter in the examples).
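The "enlarge at render time" idea could look roughly like the sketch below. It is not part of Melting Pot: `upscale_for_rendering` is a hypothetical helper, and it assumes the observation is an (H, W, C) numpy array with one pixel per grid cell.

```python
import numpy as np


def upscale_for_rendering(obs, scale=8):
    """Enlarge an (H, W, C) image by repeating each pixel scale x scale times.

    Hypothetical helper: assumes one pixel per grid cell in `obs`.
    """
    rows = np.repeat(obs, scale, axis=0)   # stretch vertically
    return np.repeat(rows, scale, axis=1)  # stretch horizontally


# A random 11x11 RGB observation standing in for a spriteSize=1 frame.
small_obs = np.random.randint(0, 256, size=(11, 11, 3), dtype=np.uint8)
frame = upscale_for_rendering(small_obs)
print(frame.shape)
```

Note that this only produces blocks of uniform colour, so any detail drawn inside the original 8x8 sprites would not be recovered this way.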

(this is based on the state of the repo a couple of weeks ago, I don't know if anything changed)

duenez commented 1 year ago

Because the test scenarios have pretrained agents that use spriteSize 8, we shouldn't change that. I agree that for many solutions, collapsing the 8x8 sprites to 1x1 is reasonable. That is what we are recommending as an entry-level approach to training agents in the upcoming MeltingPot Competition. However, changing this in the config isn't the best way to do it. What we think is the best option is to write a substrate wrapper that just averages each 8x8 sprite down to a single pixel (e.g. with numpy convolutions).
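The averaging step such a wrapper would apply might be sketched as follows. This covers only the array operation, not a wrapper against the actual substrate API, and it uses a reshape followed by a mean rather than a convolution (for non-overlapping 8x8 blocks the result is the same). `downsample_sprites` is a hypothetical name, and the (H, W, C) layout is an assumption.

```python
import numpy as np


def downsample_sprites(rgb, sprite_size=8):
    """Average each sprite_size x sprite_size block of an (H, W, C) image.

    Hypothetical helper: a real wrapper would apply this to the image
    observation of every player before passing it to the agent.
    """
    height, width, channels = rgb.shape
    if height % sprite_size or width % sprite_size:
        raise ValueError("image size must be a multiple of sprite_size")
    # Split both spatial axes into (cell index, pixel-within-cell).
    blocks = rgb.reshape(height // sprite_size, sprite_size,
                         width // sprite_size, sprite_size, channels)
    mean = blocks.mean(axis=(1, 3))
    if np.issubdtype(rgb.dtype, np.integer):
        mean = np.rint(mean)  # round instead of truncating for uint8 images
    return mean.astype(rgb.dtype)


# Build an 88x88 frame from flat-coloured 8x8 cells, then collapse it again.
cells = np.random.randint(0, 256, size=(11, 11, 3), dtype=np.uint8)
big = np.repeat(np.repeat(cells, 8, axis=0), 8, axis=1)
collapsed = downsample_sprites(big)
print(collapsed.shape)
```

Because the pretrained agents in the test scenarios expect spriteSize 8, doing this on the learner's side leaves the substrate configs untouched.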