allenai / ai2thor

An open-source platform for Visual AI.
http://ai2thor.allenai.org
Apache License 2.0
1.16k stars, 217 forks

Get the whole 3d scene state (from a global POV) #961

Open nikita-petrashen opened 2 years ago

nikita-petrashen commented 2 years ago

Hi! Is there any way to access the whole state of the scene through the ai2thor Python interface? For example, consider an agent that views the scene from a global perspective, so that the whole scene is observable as a 3D structure.

If this isn't possible right now, could you give me a hint on how it could be implemented?

Thanks!

mattdeitke commented 2 years ago

Hi @nikita-petrashen,

I'd recommend adding a third-party camera to the scene: https://ai2thor.allenai.org/ithor/documentation/environment-state/#add-camera.

To add a top-down camera to the scene, I'd recommend checking out this notebook: https://colab.research.google.com/drive/1GSIF78B62hNskyr-SYtUMgMdsZ7tJpJQ?usp=sharing
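As a rough illustration of the top-down camera approach, here is a minimal sketch. It assumes the `GetMapViewCameraProperties` and `AddThirdPartyCamera` actions behave as in recent ai2thor releases, and that `controller` is an `ai2thor.controller.Controller` (or anything with a compatible `step` method). The helper name `add_top_down_camera` is made up for this example and is not part of ai2thor.

```python
import copy


def add_top_down_camera(controller):
    """Add an overhead third-party camera and return its latest RGB frame.

    `controller` is assumed to be an ai2thor Controller, or any object whose
    `step(action=..., **kwargs)` returns an event with `metadata` and
    `third_party_camera_frames` attributes.
    """
    # Ask the simulator for camera properties that frame the scene from above.
    event = controller.step(action="GetMapViewCameraProperties")
    pose = copy.deepcopy(event.metadata["actionReturn"])

    # Add a third-party camera using those properties.
    event = controller.step(action="AddThirdPartyCamera", **pose)

    # Take the last entry, assuming the newly added camera is the last one listed.
    return event.third_party_camera_frames[-1]
```

With ai2thor installed, calling `add_top_down_camera(Controller(scene="FloorPlan1"))` should return the overhead image as an array. Note that this is still a 2D render, not a 3D representation of the scene.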

nikita-petrashen commented 2 years ago

Thanks for your reply! Sadly, this is not what I need. I'd like to be able to run the whole scene through a point cloud segmentation model. Is it possible to access the mesh data of the scene through ai2thor or some sort of Unity script? Thanks!

nikita-petrashen commented 2 years ago

I've run through the ai2thor code but couldn't figure out where the 3D mesh data is converted to 2D renders (RGB, depth, semantic masks, etc.).

mattdeitke commented 2 years ago

> Thanks for your reply! Sadly, this is not what I need. I'd like to be able to run the whole scene through a point cloud segmentation model. Is it possible to access the mesh data of the scene through ai2thor or some sort of Unity script? Thanks!

Ahh, I see. We don't currently support returning the whole mesh of the scene from Unity. I'm not sure how easy this would be to support, but adding in @AlvaroHG, who might know the most on this.

> I've run through the ai2thor code but couldn't figure out where the 3D mesh data is converted to 2D renders (RGB, depth, semantic masks, etc.).

This is done inside of Unity with cameras and shaders.
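Since the renders come from cameras, one possible workaround for getting a point cloud (an assumption on my part, not something ai2thor provides out of the box) is to go the other way: enable depth rendering, for example with `renderDepthImage=True` on the controller, and back-project each depth frame into 3D points. The sketch below is plain Python and assumes a pinhole camera with square pixels, a vertical field of view, and depth measured in meters along the camera's forward axis. The function name `depth_to_points` is hypothetical.

```python
import math


def depth_to_points(depth, fov_deg):
    """Back-project a depth image into camera-space 3D points.

    depth:   list of rows (H x W) of distances along the camera's forward
             axis, in meters (assumed convention).
    fov_deg: vertical field of view in degrees.
    Returns a list of (x, y, z) tuples with x right, y up, z forward.
    """
    height = len(depth)
    width = len(depth[0])

    # Focal length in pixels for a pinhole camera with this vertical FOV.
    focal = (height / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    cx, cy = width / 2.0, height / 2.0

    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Skip pixels with no valid depth.
                continue
            # Use pixel centers; image rows grow downward, so flip y.
            x = (u + 0.5 - cx) * z / focal
            y = -(v + 0.5 - cy) * z / focal
            points.append((x, y, z))
    return points
```

The points are in the camera's own coordinate frame. Merging clouds from several cameras into one scene-level cloud would additionally require transforming each set by its camera's position and rotation, which is not shown here.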

nikita-petrashen commented 2 years ago

Great, thanks! Could you please point me to the place in the C# code where this happens? I believe it's somewhere inside the unity folder in the repo, but I couldn't find it.