corl-team / xland-minigrid

JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️
Apache License 2.0

observations of env when not RGB #15

Closed wcarvalho closed 6 months ago

wcarvalho commented 6 months ago

Hello,

Great codebase. I just want to double-check my understanding: for the non-RGB experiments you've done, do you use the uint8 grid values directly as inputs? I was expecting to see you convert them to one-hot vectors or something similar.

Thanks

Howuhh commented 6 months ago

Hi @wcarvalho! That's a good question! Converting them to embeddings or one-hot vectors is the most correct way to work with this representation, and I mention it in the paper. However, the baselines currently use a naive approach, taken from the original baselines for MiniGrid, where the "picture" is fed to a CNN as is. While wrong in general, this allows testing generalization to new object ids, which one-hot does not. And it kinda works somehow...
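
For readers comparing the two options discussed above, here is a minimal sketch in JAX. It is not taken from the repository's baselines. It assumes a symbolic observation of shape `(H, W, 2)` holding a tile id and a color id per cell, and the names `NUM_TILES`, `NUM_COLORS`, `raw_obs`, and `one_hot_obs` are hypothetical, chosen only for illustration.

```python
import jax
import jax.numpy as jnp

# Hypothetical vocabulary sizes (assumption); check the environment's
# constants for the real number of tile and color ids.
NUM_TILES = 16
NUM_COLORS = 16


def raw_obs(obs):
    """Naive approach: cast the uint8 ids to float and feed them to the CNN as is."""
    return obs.astype(jnp.float32)


def one_hot_obs(obs):
    """One-hot encode the tile and color channels, then concatenate them."""
    ids = obs.astype(jnp.int32)
    tiles = jax.nn.one_hot(ids[..., 0], NUM_TILES)
    colors = jax.nn.one_hot(ids[..., 1], NUM_COLORS)
    return jnp.concatenate([tiles, colors], axis=-1)


# A 5x5 symbolic view with one non-empty cell: tile id 3, color id 7.
obs = jnp.zeros((5, 5, 2), dtype=jnp.uint8)
obs = obs.at[2, 2].set(jnp.array([3, 7], dtype=jnp.uint8))

print(raw_obs(obs).shape)      # (5, 5, 2)
print(one_hot_obs(obs).shape)  # (5, 5, 32)
```

With the one-hot variant, the input channel count is fixed by the vocabulary size, and an id outside that range encodes to an all-zero vector in `jax.nn.one_hot`, which is one way to see why it cannot be used to probe generalization to new object ids. The raw variant keeps two channels regardless of how many ids exist, at the cost of imposing an arbitrary ordering on categorical values.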