beyretb / AnimalAI-Olympics

Code repository for the Animal AI Olympics competition

Better rendering during training #21

Closed yding5 closed 5 years ago

yding5 commented 5 years ago

Hi, I found the window showing the environment during training very helpful for observing the agent's behavior. However, the frame rate on my machine is very low (at most 4 frames/s) and the window gets stuck from time to time. Meanwhile, both CPU and GPU utilization are low.

I have read that this might be related to the setup used in Unity to speed up training, but I don't know whether this is an issue with my machine or whether it is the same for everyone. Is there any way to improve it, or do you have any suggestions for observing the agent's behavior conveniently? Thanks!

beyretb commented 5 years ago

Hello,

Are you running one of the examples provided or your own implementation? Also, could you please provide your hardware/OS setup?

Keep in mind that the environment will freeze between actions, so the stutter may simply reflect the time taken by training steps. The PPO example provided in examples/trainMLAgents.py takes a training step after most observations are received, which explains the jerky rendering.

yding5 commented 5 years ago

Hello,

Yes, I was running the PPO example; sorry for forgetting to mention that. The setup is an Intel X5460, a GTX 980 Ti, and Ubuntu 16.04 LTS. I tried the Rainbow example, which has much smoother rendering. I haven't fully understood what "most observations" means, but I will look into it further when trying my own implementation to see whether there are any problems. Thanks.

beyretb commented 5 years ago

Hi, I said "most" because, technically speaking, training may not start right away: you may need to collect a certain number of observations to fill the replay buffer first. But that's really just a technicality...

Also, be aware that all the code provided, even the environment, barely makes any use of the GPU (hence the low GPU utilisation). Switching to tensorflow-gpu for Rainbow should speed things up and make use of the GPU.
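For example (a minimal sketch, assuming the TensorFlow 1.x API the examples were written against), you can quickly check whether TensorFlow actually sees a GPU after installing tensorflow-gpu:

import tensorflow as tf

# Prints True if a CUDA-capable device is visible to TensorFlow, False otherwise.
print(tf.test.is_gpu_available())
# Prints the device name, e.g. '/device:GPU:0', or an empty string if none is found.
print(tf.test.gpu_device_name())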

yding5 commented 5 years ago

Hi, I did some research on this. When I run the PPO example, although the frame rate of the window showing the top view is very low as described previously, the training speed is about 30 steps/s. The same thing happens with visualizeLightsOff.py: the update rate of the display window is poor and is effectively unaffected by changing the interval in https://github.com/beyretb/AnimalAI-Olympics/blob/4cead96af383435329472082f0e69d78a13cfc35/examples/visualizeLightsOff.py#L62

Is it possible that the reason is that the rendering of the window is independent of, or at least not synchronized with, the simulation steps? Ref: https://github.com/Unity-Technologies/ml-agents/issues/299

Overall, we are trying to get the top view at each step, along with the agent view, for analysis. One suggestion is to add an additional agent or observation and get it from the visual observation interface (https://github.com/Unity-Technologies/ml-agents/issues/134). However, it seems to us that this is not possible because we are using the executable environment you provided. Is that right? If so, one workaround we are considering is taking screenshots of the window, but then the refresh rate issue of the window comes into play.

beyretb commented 5 years ago

Hello,

Is it possible that the reason is that the rendering of the window is independent of, or at least not synchronized with, the simulation steps?

Yes, that's very likely part of the explanation as well: we maximise the number of frames for the agent independently of the actual screen rendering.

However, it seems to us that this is not possible because we are using the executable environment you provided. Is that right?

Correct, there is no brain attached to the top-down camera at the moment.

One suggestion is to add an additional agent or observation and get it from the visual observation interface

This is feasible; I will look into adding one camera per arena (at present there is only one for the whole environment) and attaching these to an extra brain. I will need to run some tests to see by how much this slows down training and will decide how to proceed later on. We are currently working on having the competition ready by the end of the month; I will look into this suggestion once that is done.

Keep in mind, however, that these extra observations would be for information purposes only and would not be provided at test time, as that would break comparability with the actual experiments from the animal cognition literature.

terryzhao127 commented 5 years ago

It is very weird that the env.render() function does not work at all.

The code (a file named test.py at the root of the repo) is:

from animalai.envs.gym.environment import AnimalAIEnv
from animalai.envs.arena_config import ArenaConfig

import random

env_path = 'env/AnimalAI'
worker_id = random.randint(1, 100)
arena_config_in = ArenaConfig('examples/configs/1-Food.yaml')
gin_files = ['examples/configs/rainbow.gin']

env = AnimalAIEnv(environment_filename=env_path,
                  worker_id=worker_id,
                  n_arenas=1,
                  arenas_configurations=arena_config_in,
                  retro=True)
env.reset()

done = False
while not done:
    env.render()
    action = env.action_space.sample()  # your agent here (this takes random actions)
    observation, reward, done, _ = env.step(action)

env.close()

My computer runs Ubuntu 18.10, but I don't think this is related to the operating system.

When I use the PyCharm debugger, if I pause at a statement in the while loop and then press Resume Program, render() works. Is it because the Python process runs too fast for the rendering system to keep up?

beyretb commented 5 years ago

The gym environment is purely a wrapper around the Unity ML Agents environment and allows libraries such as Dopamine to be plugged in directly without changing the code too much. Therefore, the logic behind it is a bit different from OpenAI Gym and follows the Unity way of doing things.

You can see in the source that the env.render() function actually only returns the visual observations from the agent:

def render(self, mode='rgb_array'):
    return self.visual_obs

As mentioned above (and detailed in this issue), the Unity window rendering is not synchronised with the agent. If you wish to have a step-by-step visualisation of the environment, you can use the visualisation approach from examples/visualizeLightsOff.py, replace the ML Agents environment with the Gym one, and display the output of env.render().
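As an illustration, here is a minimal sketch of that approach, reusing the constructor arguments from the test.py snippet above; the matplotlib plumbing is an assumption, not the actual code from visualizeLightsOff.py:

import random
import matplotlib.pyplot as plt
from animalai.envs.gym.environment import AnimalAIEnv
from animalai.envs.arena_config import ArenaConfig

env = AnimalAIEnv(environment_filename='env/AnimalAI',
                  worker_id=random.randint(1, 100),
                  n_arenas=1,
                  arenas_configurations=ArenaConfig('examples/configs/1-Food.yaml'),
                  retro=True)
env.reset()

plt.ion()                            # interactive mode so the figure refreshes inside the loop
frame = plt.imshow(env.render())     # env.render() returns the agent's visual observation
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
    frame.set_data(env.render())     # show the latest agent observation
    plt.pause(0.01)                  # give matplotlib a chance to redraw
env.close()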

As a side note, retrieving the outcome of a single step is the same as in Gym; the env.info attribute also contains the ML Agents brain.

terryzhao127 commented 5 years ago

@beyretb I am so sorry, I don't even know ML Agents. Does this file use matplotlib to render the game?

Another question, please: if I don't use any rendering functionality, can the game be run purely through the gym wrapper I used in the code above?

beyretb commented 5 years ago

Hey, no worries, you don't need knowledge of ML Agents; the documentation should be enough to get started with the packages we provide.

The visualizeLightsOff.py script is purely a visualisation tool for you to see what the agent sees, and matplotlib is used for this purpose yes. It is not needed for training though.

Running the code above works fine; you can even remove the env.render() line and you will still get observations and rewards. Does that answer your question?

terryzhao127 commented 5 years ago

... Running the code above works fine; you can even remove the env.render() line and you will still get observations and rewards. Does that answer your question?

Thanks for your answer and now I know what to do.

beyretb commented 5 years ago

Closing this issue, as better rendering for the agent is now available.