Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Depth/opticalFlow/object segmentation images as visual observation #2644

Closed · maystroh closed this issue 4 years ago

maystroh commented 4 years ago

For visual observations, the code supports rendered images from single or multiple cameras, but there is no direct way to get depth maps. Two issues requesting the depth feature (#329 and #562) were closed; I'm wondering whether it has been implemented. In addition, I think a depth image alone won't be enough to train an agent; it would be great to also have the frame-stacking feature that DeepMind used in the Atari games (4 stacked grayscale images).
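
For concreteness, here is a minimal numpy sketch of the kind of frame stacking I mean (illustrative only, not ml-agents code; the class and method names are made up):

```python
# Illustrative only -- not part of ml-agents. Keeps a rolling window of the
# last 4 grayscale frames and exposes them as a single (H, W, 4) observation,
# as in the DeepMind Atari setup.
from collections import deque

import numpy as np


class FrameStacker:
    def __init__(self, num_frames=4):
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        # Fill the window with copies of the first frame so the
        # observation has a valid shape from step 0.
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return self.observation()

    def step(self, frame):
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Stack along a new channel axis: 4 x (H, W) -> (H, W, 4).
        return np.stack(self.frames, axis=-1)
```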

For the first part, I have already built an environment that can generate depth/opticalFlow/object segmentation images along with the RGB image (something similar to this project), but I'm still not sure how to use these images to train an agent with ml-agents. I'm looking for hints/tips to help me understand what needs to be modified in ml-agents to stack multiple images of different types.

Thanks in advance.

chriselion commented 4 years ago

Hi @maystroh, there hasn't been any additional work on this since the previous requests. Someone else asked something similar the other day: https://github.com/Unity-Technologies/ml-agents/issues/2634 (I found the same project that you linked to). As mentioned there, if you can get your results into a RenderTexture, ML-Agents will convert it to a bitmap for visual observations.

maystroh commented 4 years ago

@chriselion, thanks for your reply. I do have multiple cameras that generate a list of images (one for segmentation, one for depth, etc.). Is there any way to stack them during training (assuming I'm able to get my results into a RenderTexture)?
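
For context, what I'm after is essentially a channel-wise concatenation of the modalities once each camera's output is available as an array. A sketch with made-up shapes, assuming depth and segmentation have already been normalized to [0, 1]:

```python
# Illustrative only: combine per-modality images into one multi-channel
# observation. Assumes all images share the same height/width and have been
# normalized to [0, 1].
import numpy as np


def combine_modalities(rgb, depth, segmentation):
    """rgb: (H, W, 3), depth: (H, W, 1), segmentation: (H, W, 1)
    -> (H, W, 5) observation."""
    return np.concatenate([rgb, depth, segmentation], axis=-1)
```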

chriselion commented 4 years ago

There's no support for stacking right now. You'd probably need to do some hacking around how we set up the inputs here: https://github.com/Unity-Technologies/ml-agents/blob/9370b635cb52320e0d3b73829d239f1273929021/ml-agents/mlagents/trainers/models.py#L146, and something similar where we read the protobuf and convert it to np arrays here: https://github.com/Unity-Technologies/ml-agents/blob/9370b635cb52320e0d3b73829d239f1273929021/ml-agents-envs/mlagents/envs/brain.py#L188-L194
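
To give a rough sense of the model-side change (a sketch only, not the actual models.py code; the function name and parameters here are made up), the TF1-style visual input placeholder would need its channel dimension widened to accept the stacked images:

```python
# Sketch only -- not the actual ml-agents code. In the TF1-era trainer, the
# visual observation enters the graph through a placeholder; stacking extra
# image types or frames means widening its channel dimension to match.
import tensorflow as tf


def create_stacked_visual_input(height, width, base_channels, num_stacked, name):
    # e.g. base_channels=1 (grayscale) and num_stacked=4 gives a
    # [None, H, W, 4] input, matching the Atari-style frame stack.
    return tf.placeholder(
        dtype=tf.float32,
        shape=[None, height, width, base_channels * num_stacked],
        name=name,
    )
```

On the brain.py side, the decoded per-camera arrays would then need to be concatenated along the same channel axis so the shapes line up with the placeholder.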

chriselion commented 4 years ago

I've added the request for stacking visual observations to our internal tracker with the ID MLA-52. I’m going to close this issue for now, but we’ll ping back with any updates.

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.