ArnaudFickinger / gym-multigrid

Lightweight multi-agent gridworld Gym environment
Apache License 2.0
193 stars, 40 forks

How to interpret the env.render() return? #9

Open noorwertheim opened 2 years ago

noorwertheim commented 2 years ago

env.render() returns the observation of each agent in the environment. This observation is used to choose the best policy in Q-learning. However, env.render() returns an array of arrays containing 2-dimensional arrays with values I do not understand. Can anyone clarify what these values mean? How could these observations be used for Q-learning?
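
To make the last question concrete, here is a minimal sketch of tabular Q-learning over array observations. It does not call gym-multigrid at all: the observations are plain NumPy arrays standing in for whatever one agent receives, and `N_ACTIONS`, `ALPHA`, `GAMMA`, `obs_key`, `q_update`, and `choose_action` are hypothetical names chosen for illustration. In the standard Gym API, observations normally come from `env.reset()` and `env.step()`, while `render()` is meant for visualization; whether that holds for this repository is not confirmed here.

```python
from collections import defaultdict

import numpy as np

N_ACTIONS = 4             # hypothetical: number of discrete actions per agent
ALPHA, GAMMA = 0.1, 0.9   # hypothetical learning rate and discount factor


def obs_key(obs):
    """Turn an array observation into a hashable key for the Q-table."""
    arr = np.asarray(obs)
    return (arr.shape, arr.dtype.str, arr.tobytes())


# One Q-table for one agent; unseen observations start with all-zero values.
q_table = defaultdict(lambda: np.zeros(N_ACTIONS))


def q_update(obs, action, reward, next_obs, done):
    """Apply one Q-learning update and return the new Q(obs, action)."""
    key, next_key = obs_key(obs), obs_key(next_obs)
    # Bootstrap from the best next action unless the episode has ended.
    target = reward if done else reward + GAMMA * np.max(q_table[next_key])
    q_table[key][action] += ALPHA * (target - q_table[key][action])
    return q_table[key][action]


def choose_action(obs, epsilon, rng):
    """Epsilon-greedy action selection from the Q-table."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_table[obs_key(obs)]))
```

In a multi-agent setting one would presumably keep a separate table per agent and feed each agent only its own observation. Hashing the raw array only works when the number of distinct observations is small; for larger observation spaces a function approximator is the usual substitute.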