Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Visual observations with float precision #3658

Closed: dlindmark closed this issue 2 years ago

dlindmark commented 4 years ago

Is your feature request related to a problem? Please describe. You have added uncompressed visual observations, but as I understand it, the data are still uint8? We are using a camera that renders only the depth buffer. The properties of this depth map are similar to a regular image (local features that might be learned most efficiently using a CNN), but one difference is that precision is likely more important. Depending on where we place our near and far planes, the resolution with uint8 differs a lot: with near = 0.3 m and far = 100 m, for example, 256 levels give steps of roughly 0.4 m in linear depth.

Describe the solution you'd like The ability to choose float precision on CameraSensorComponent. That should effectively change the texture format that the image is read into.

Describe alternatives you've considered Rendering to a RenderTexture with the correct format, reading that data back, and using it as a vector observation. But I would like to experiment with CNNs and see if we can get any performance gains from them.
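Roughly, the readback I have in mind looks like this (sketch only; the class and method names are made up for this example, and it assumes the camera already outputs depth into its color channels, e.g. via a replacement shader or image effect):

```csharp
using UnityEngine;

// Sketch only: read a float-precision depth image back from a camera.
// "DepthReadback" and "ReadDepth" are made-up names for this example.
public static class DepthReadback
{
    public static float[] ReadDepth(Camera cam, int width, int height)
    {
        var rt = RenderTexture.GetTemporary(width, height, 24, RenderTextureFormat.RFloat);
        var tex = new Texture2D(width, height, TextureFormat.RFloat, false);

        var prevTarget = cam.targetTexture;
        cam.targetTexture = rt;
        cam.Render();
        cam.targetTexture = prevTarget;

        var prevActive = RenderTexture.active;
        RenderTexture.active = rt;
        tex.ReadPixels(new Rect(0, 0, width, height), 0, 0);
        RenderTexture.active = prevActive;
        RenderTexture.ReleaseTemporary(rt);

        // GetPixels returns float channels; for RFloat only .r is meaningful.
        var pixels = tex.GetPixels();
        var depths = new float[pixels.Length];
        for (var i = 0; i < pixels.Length; i++)
        {
            depths[i] = pixels[i].r;
        }
        Object.Destroy(tex);
        return depths;
    }
}
```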

P.S. Please share if anyone has papers where CNNs are used with depth camera data.

chriselion commented 4 years ago

You have added uncompressed visual observations, but as I understand it, the data are still uint8?

It depends on how you're using it. If you create your own ISensor that uses SensorCompressionType.None and produces float "visual" (len(shape) == 3) observations, no additional quantization will be performed. We have a simple example of this here: https://github.com/Unity-Technologies/ml-agents/blob/7507a5d3f5515ae7877eb3a3e9abaa2e2a270930/com.unity.ml-agents/Tests/Editor/Sensor/FloatVisualSensorTests.cs#L6, and it should be straightforward to extend it to more channels. But you're correct that setting the CompressionType on a CameraSensor to None will still force things through uint8 first.
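For illustration, a minimal single-channel version might look like this (modeled on the test linked above; the ISensor method names here match this commit and may differ in other releases):

```csharp
using Unity.MLAgents.Sensors;

// Sketch of a single-channel float "visual" sensor, modeled on the
// FloatVisualSensorTests example. Because GetCompressionType returns None
// and Write passes floats straight through, nothing is quantized to uint8.
public class DepthFloatSensor : ISensor
{
    readonly int m_Width;
    readonly int m_Height;
    readonly string m_Name;
    public float[,] Data;  // fill this externally, e.g. from a depth readback

    public DepthFloatSensor(int width, int height, string name)
    {
        m_Width = width;
        m_Height = height;
        m_Name = name;
        Data = new float[height, width];
    }

    public string GetName() => m_Name;

    // A rank-3 shape (H, W, C) is what makes this a "visual" observation.
    public int[] GetObservationShape() => new[] { m_Height, m_Width, 1 };

    public byte[] GetCompressedObservation() => null;

    public SensorCompressionType GetCompressionType() => SensorCompressionType.None;

    public void Update() { }

    public void Reset() { }

    public int Write(ObservationWriter writer)
    {
        // Floats pass through as-is; no quantization happens here.
        for (var h = 0; h < m_Height; h++)
        {
            for (var w = 0; w < m_Width; w++)
            {
                writer[h, w, 0] = Data[h, w];
            }
        }
        return m_Height * m_Width;
    }
}
```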

I'll admit that I know very little about rendering in Unity (and I don't think anyone else on the team is an expert either), so you might need to walk me through this a bit :) It looks like we'd need to change the texture format here: https://github.com/Unity-Technologies/ml-agents/blob/7507a5d3f5515ae7877eb3a3e9abaa2e2a270930/com.unity.ml-agents/Runtime/Sensors/CameraSensor.cs#L128 to either RFloat or RGBAFloat depending on the grayscale flag.
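Sketching the format choice as a helper (note that floatPrecision here is a hypothetical new flag, not an existing CameraSensor field):

```csharp
using UnityEngine;

// Sketch of the format choice in ObservationToTexture; "floatPrecision" is
// a hypothetical new flag, not an existing CameraSensor field.
static class ObservationTextureFormat
{
    public static Texture2D CreateTexture(int width, int height, bool grayScale, bool floatPrecision)
    {
        var format = floatPrecision
            ? (grayScale ? TextureFormat.RFloat : TextureFormat.RGBAFloat)
            : TextureFormat.RGB24;  // current behavior
        return new Texture2D(width, height, format, false);
    }

    // The temporary RenderTexture the camera renders into would need a
    // matching float format so precision isn't lost before ReadPixels.
    public static RenderTextureFormat RenderFormat(bool grayScale, bool floatPrecision)
    {
        return floatPrecision
            ? (grayScale ? RenderTextureFormat.RFloat : RenderTextureFormat.ARGBFloat)
            : RenderTextureFormat.Default;
    }
}
```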

Then when converting the texture to an observation, we'd need to change this part of the code: https://github.com/Unity-Technologies/ml-agents/blob/7507a5d3f5515ae7877eb3a3e9abaa2e2a270930/com.unity.ml-agents/Runtime/Utilities.cs#L31 to use GetPixels() (and obviously not divide by 255). Does that cover it? Anything I'm missing?
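Roughly like this, I'd guess (a sketch; writer and grayScale are assumed to come from the surrounding code, as in the existing uint8 path):

```csharp
using UnityEngine;
using Unity.MLAgents.Sensors;

// Sketch of the float path that would replace the GetPixels32 /
// divide-by-255 code. GetPixels returns Color structs with float channels,
// so values pass through unquantized. The vertical flip mirrors the
// orientation handling in the existing uint8 path.
static class FloatTextureUtils
{
    public static void TextureToFloatObservation(Texture2D texture, ObservationWriter writer, bool grayScale)
    {
        var width = texture.width;
        var height = texture.height;
        var pixels = texture.GetPixels();

        for (var h = height - 1; h >= 0; h--)
        {
            for (var w = 0; w < width; w++)
            {
                var pixel = pixels[(height - h - 1) * width + w];
                if (grayScale)
                {
                    writer[h, w, 0] = (pixel.r + pixel.g + pixel.b) / 3f;
                }
                else
                {
                    // No division by 255: the channels are already floats.
                    writer[h, w, 0] = pixel.r;
                    writer[h, w, 1] = pixel.g;
                    writer[h, w, 2] = pixel.b;
                }
            }
        }
    }
}
```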

BTW, do you know of any examples of how to set up rendering using the depth buffer? It's come up a few times before so it might be good for us to have an example of it.

dlindmark commented 4 years ago

Sorry for the late reply. I had a couple of fires to put out.

I am no Unity developer either; I have used Unity about as much as ml-agents :) But I think your suggestions would be enough.

For rendering the depth buffer, I would point to a shader in the ML-ImageSynthesis repo. With that shader it is possible to render the depth buffer, normals, and/or semantically segmented images.

The shader uses the macro COMPUTE_DEPTH_01. More macros like it are documented here: https://docs.unity3d.com/Manual/SL-DepthTextures.html.
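On the C# side, the setup would be roughly this (a sketch; "Hidden/DepthToColor" is a placeholder name for an actual depth shader like the ML-ImageSynthesis one, which is where COMPUTE_DEPTH_01 would be used):

```csharp
using UnityEngine;

// Sketch of the C# side of a depth camera. "Hidden/DepthToColor" is a
// placeholder for a real depth shader. Attach this to the observation camera.
[RequireComponent(typeof(Camera))]
public class DepthCapture : MonoBehaviour
{
    Material m_DepthMaterial;

    void Start()
    {
        // Ask Unity to generate the depth texture that the shader samples.
        GetComponent<Camera>().depthTextureMode = DepthTextureMode.Depth;
        m_DepthMaterial = new Material(Shader.Find("Hidden/DepthToColor"));
    }

    void OnRenderImage(RenderTexture source, RenderTexture destination)
    {
        // Replace the camera's color output with per-pixel depth in [0, 1].
        Graphics.Blit(source, destination, m_DepthMaterial);
    }
}
```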

Have you started work on this? Do you have time for it? I am wondering if I should try to make a contribution.

chriselion commented 4 years ago

Sorry for the delay - no-meeting Monday -> all-meeting Tuesday.

We haven't started working on this, and to be honest it's not at the top of our list right now (especially since you already have a test case set up). If you want to try it out and submit a PR, that would be great.

I think the parts of the code I linked to before should be enough to get you started; you basically should be able to change the texture format and the pixel conversion in those two places.

dlindmark commented 4 years ago

OK, I overestimated the time I had in March...

But I should have some time to do this now. That is, if no one else has started on it?

In addition to what @chriselion mentioned above, I am thinking about setting up a simple example with an agent that trains using only a depth camera. Maybe a new scene in the Wall Jump example environment?
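Roughly what I have in mind for the agent setup (a sketch; DepthCapture is the hypothetical component from my earlier comment, while the CameraSensorComponent fields shown are existing ones):

```csharp
using Unity.MLAgents.Sensors;
using UnityEngine;

// Sketch of the proposed example: an agent whose only observation source is
// a depth camera. "DepthCapture" is the hypothetical post-effect component
// from the earlier comment.
public class DepthOnlyAgentSetup : MonoBehaviour
{
    public Camera depthCamera;  // a camera with DepthCapture attached

    void Awake()
    {
        var sensor = gameObject.AddComponent<CameraSensorComponent>();
        sensor.Camera = depthCamera;
        sensor.SensorName = "DepthCameraSensor";
        sensor.Width = 84;
        sensor.Height = 84;
        sensor.Grayscale = true;  // depth is single-channel
        // Uncompressed, so a float path would not be quantized via PNG.
        sensor.CompressionType = SensorCompressionType.None;
    }
}
```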

github-actions[bot] commented 1 year ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.