android / camera-samples

Multiple samples showing the best practices in camera APIs on Android.
Apache License 2.0

Samples don't cover video with depth output #195

Closed SSKKBrianG closed 3 years ago

SSKKBrianG commented 4 years ago

There is an example of how to get a single depth JPEG, but not how to do the same with video.

Specifically, with a device that supports REQUEST_AVAILABLE_CAPABILITIES_DEPTH_OUTPUT and DEPTH_DEPTH_IS_EXCLUSIVE, it is unclear how to actually get the depth information. The color and depth requests will be interleaved somehow and the depth will be either DEPTH16 or DEPTH_POINT_CLOUD. But what this looks like in code is a mystery.
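
For reference, here is roughly how I am detecting depth support (a sketch; cameraManager and cameraId come from my existing setup):

// imports: android.graphics.ImageFormat, android.hardware.camera2.CameraCharacteristics
val characteristics = cameraManager.getCameraCharacteristics(cameraId)

// Does this camera advertise depth output at all?
val capabilities = characteristics.get(CameraCharacteristics.REQUEST_AVAILABLE_CAPABILITIES)
val hasDepth = capabilities?.contains(
    CameraCharacteristics.REQUEST_AVAILABLE_CAPABILITIES_DEPTH_OUTPUT) == true

// Documented to mean that a single request cannot target both depth and color outputs.
val depthIsExclusive = characteristics.get(CameraCharacteristics.DEPTH_DEPTH_IS_EXCLUSIVE) == true

// Which DEPTH16 output sizes are available, if any?
val depthSizes = characteristics.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP)
    ?.getOutputSizes(ImageFormat.DEPTH16)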

owahltinez commented 4 years ago

@SSKKBrianG there are some code snippets in the documentation, for example: https://developer.android.com/reference/android/graphics/ImageFormat#DEPTH_POINT_CLOUD

I don't know of any devices that support either DEPTH16 or DEPTH_POINT_CLOUD as output, which is why making a sample for it has been pretty low priority. Keep in mind, too, that video in particular may not be possible: there are no encoders that accept the DEPTH16 or DEPTH_POINT_CLOUD pixel formats, so you would have to convert to a (grayscale?) bitmap yourself before encoding the video.
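
In case it helps: per the DEPTH16 documentation, each sample is 16 bits, with the depth in millimeters in the lower 13 bits and a confidence code in the upper 3 bits. Decoding one sample looks roughly like this (depthArray, strideInShorts, x and y are placeholders):

val sample = depthArray[y * strideInShorts + x]     // one 16-bit DEPTH16 sample
val rangeMm = sample.toInt() and 0x1FFF             // lower 13 bits: depth in millimeters
val confidence = (sample.toInt() shr 13) and 0x7    // upper 3 bits: confidence code
// Per the docs, 0 means confidence is unknown; otherwise 1..7 maps onto (0, 1].
val confidenceFraction = if (confidence == 0) 1.0f else (confidence - 1) / 7.0f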

The good news is that, if you want to give it a try, the new emulator running Android Q should support DEPTH16 pixel format!

SSKKBrianG commented 4 years ago

Thank you for the response.

In my case, I don't have an Image, so I don't believe that code snippet will apply.

I am using the CameraDevice to create a CameraCaptureSession that is connected to a SurfaceTexture. I begin a repeating request that makes each camera frame available through onFrameAvailable, which then schedules an updateTexImage call so the frame can be rendered on screen.
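
In code, the wiring looks roughly like this (a sketch; texId is a GL_TEXTURE_EXTERNAL_OES texture name from my renderer, requestRender schedules updateTexImage plus a draw on the GL thread, and glHandler/captureSession come from my existing setup):

val surfaceTexture = SurfaceTexture(texId).apply {
    setDefaultBufferSize(width, height)
    // Called once per camera frame; updateTexImage() must run on the GL thread.
    setOnFrameAvailableListener({ requestRender() }, glHandler)
}
val previewSurface = Surface(surfaceTexture)
val requestBuilder = cameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW).apply {
    addTarget(previewSurface)
}
captureSession.setRepeatingRequest(requestBuilder.build(), null, glHandler)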

I have a device that supports REQUEST_AVAILABLE_CAPABILITIES_DEPTH_OUTPUT and has the flag DEPTH_DEPTH_IS_EXCLUSIVE.

Note that the color/depth textures are intended to be used live and not through an image or video format.

owahltinez commented 4 years ago

I don't think that what you are proposing would work. DEPTH16 and DEPTH_POINT_CLOUD are data formats, not intended for displaying without manipulation. I would be surprised if TextureView knew what to do with the pixels, since they have coordinates but they don't have a "color" associated with them.

My recommendation to work with depth-related data is to subscribe to the capture callback and parse the image data. You can then build a grayscale bitmap if you wish to display something to users.
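
A rough sketch of that conversion (maxRangeMm is an arbitrary normalization constant, and this assumes tightly packed rows and a native-byte-order buffer):

// imports: android.graphics.Bitmap, android.graphics.Color, android.media.Image, java.nio.ByteOrder
fun depthToGrayscale(image: Image, maxRangeMm: Int = 5000): Bitmap {
    val shorts = image.planes[0].buffer.order(ByteOrder.nativeOrder()).asShortBuffer()
    val pixels = IntArray(image.width * image.height)
    for (i in pixels.indices) {
        val rangeMm = shorts.get(i).toInt() and 0x1FFF       // lower 13 bits: millimeters
        val gray = rangeMm.coerceAtMost(maxRangeMm) * 255 / maxRangeMm
        pixels[i] = Color.argb(255, gray, gray, gray)
    }
    return Bitmap.createBitmap(pixels, image.width, image.height, Bitmap.Config.ARGB_8888)
}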

SSKKBrianG commented 4 years ago

Just to clarify, I'm not trying to display the depth texture.

I'm already subscribed to the capture callback as described above. I just don't understand how to get the depth data.

I tried the following, but I'm not sure if it is correct.

val colorSurface = Surface(cameraColorSurface)
val depthSurface = Surface(cameraDepthSurface)
val previewRequestBuilder = cameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_RECORD)
previewRequestBuilder.addTarget(colorSurface)
previewRequestBuilder.addTarget(depthSurface)
cameraDevice.createCaptureSession(listOf(colorSurface, depthSurface),
    object : CameraCaptureSession.StateCallback() {
        override fun onConfigured(session: CameraCaptureSession) {
            session.setRepeatingRequest(previewRequestBuilder.build(), null, null)
        }
        override fun onConfigureFailed(session: CameraCaptureSession) = Unit
    }, null)

Is the second surface for the depth? Is this also created as GL_TEXTURE_EXTERNAL_OES?

SSKKBrianG commented 4 years ago

I didn't have any luck going directly to a SurfaceTexture. I suspect this is because the format isn't correct.

I reviewed the Camera2Basic code again and adapted the ImageReader code to work on the live stream. So now it looks something like this.

val colorSurface = Surface(cameraColorSurface)
val depthReader = ImageReader.newInstance(width, height, ImageFormat.DEPTH16, 1)
depthReader.setOnImageAvailableListener({ reader ->
    val image = reader.acquireNextImage()
    // TODO: read the depth data from image.planes[0].buffer before closing
    image.close()
}, cameraDepthHandler)
val depthSurface = depthReader.surface
val previewRequestBuilder = cameraDevice.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)
previewRequestBuilder.addTarget(colorSurface)
previewRequestBuilder.addTarget(depthSurface)
cameraDevice.createCaptureSession(listOf(colorSurface, depthSurface),
    object : CameraCaptureSession.StateCallback() {
        override fun onConfigured(session: CameraCaptureSession) {
            session.setRepeatingRequest(previewRequestBuilder.build(), null, null)
        }
        override fun onConfigureFailed(session: CameraCaptureSession) = Unit
    }, null)
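
(One thing I'm unsure about: with maxImages = 1, a live stream can stall if a frame isn't closed in time, so allowing a couple of buffers and always draining to the newest image might be safer, e.g.:)

val depthReader = ImageReader.newInstance(width, height, ImageFormat.DEPTH16, 2)
depthReader.setOnImageAvailableListener({ reader ->
    // Drop any backlog and process only the most recent depth frame.
    reader.acquireLatestImage()?.use { image ->
        // read image.planes[0].buffer here
    }
}, cameraDepthHandler)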

I guess from here I need to render the depth information to a GLES depth texture (which is what the SurfaceTexture is supposed to do).

Does this sound correct?

owahltinez commented 4 years ago

> Just to clarify, I'm not trying to display the depth texture.

Then why are you putting it in a SurfaceTexture? Textures are generally used for displaying things; I would recommend using a plain Surface if all you want is to analyze incoming frames. It's entirely possible that the SurfaceTexture does not know what to do with the depth data, and that's why you are not able to make it work.

> I reviewed the Camera2Basic code again and adapted the ImageReader code to work on the live stream. So now it looks something like this.

Yes, that looks roughly correct -- that's how you would get an Image object and then you can do as the documentation suggests. Let me know if you run into any other issues going this route.

I think I now understand what you are trying to do a little bit better. I might have a device which supports DEPTH16 image format, if I can get my hands on it then I will try to add a sample to this repo.

SSKKBrianG commented 4 years ago

What I ultimately need is two GLES textures. One with color information and the other depth. The two are processed together by shader code and rendered to a GLSurfaceView.
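
The combining step is just a fragment shader that samples both textures, roughly like this (the uniform names are mine, and the combine itself is a placeholder):

val fragmentShader = """
    #extension GL_OES_EGL_image_external : require
    precision mediump float;
    uniform samplerExternalOES uColorTex;  // camera frames via SurfaceTexture
    uniform sampler2D uDepthTex;           // depth values uploaded separately
    varying vec2 vTexCoord;
    void main() {
        vec4 color = texture2D(uColorTex, vTexCoord);
        float depth = texture2D(uDepthTex, vTexCoord).r;
        gl_FragColor = vec4(color.rgb * depth, 1.0);  // placeholder combine
    }
""".trimIndent()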

The reason for the SurfaceTexture is to bridge between the camera image and a GLES texture, since it handles all the details of taking the CPU side image and transferring it to the GPU for each frame.

But as you said, it appears the SurfaceTexture does not know how to handle depth data. I'll continue on the path of the ImageReader for now.

Thank you for the help!

SSKKBrianG commented 4 years ago

Using a Huawei P30 Pro, I've added the code to acquire the depth data and apply it to a GLES texture. To debug, I'm rendering this texture. It's not what I expected.

In the image below, I have 3 screenshots.

On the left is when I attempted to treat the data according to the DEPTH16 spec. I created a texture with an internal format of GL_R16UI, a format of GL_RED_INTEGER and type of GL_UNSIGNED_SHORT, using image.planes[0].buffer.asShortBuffer() for the buffer. Removing the asShortBuffer() didn't make any difference.

In the center, I instead used an internal format of GL_R8, a format of GL_RED, a type of GL_UNSIGNED_BYTE, and image.planes[0].buffer.

The right is the same as the center, except I apply a 0x1FFF mask to the data. This seemed to darken alternating lines.

According to the Image instance, the width is 992, the height is 744, the pixel stride is 2, the row stride is 1,984, the byte count is 1,476,096, and the format is DEPTH16.

My observation is that although the image is formatted correctly as DEPTH16, the actual data appears to be 8-bit. But it doesn't really look like a depth map either. For example, the letters on the keyboard are at a different depth than the keys, and the depth values for the keys are all over the place. It doesn't seem very useful.
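
For reference, the upload for the left screenshot looks roughly like this (texture creation omitted; depthTexId comes from my renderer):

// Note to self: an integer texture like GL_R16UI can only be sampled with a
// usampler2D (not a regular sampler2D) and GL_NEAREST filtering; anything else
// gives undefined results, which may explain the garbage on the left.
val buffer = image.planes[0].buffer
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, depthTexId)
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_NEAREST)
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_NEAREST)
GLES30.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES30.GL_R16UI, image.width, image.height,
    0, GLES30.GL_RED_INTEGER, GLES20.GL_UNSIGNED_SHORT, buffer)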

Any ideas?

[screenshot: Depth1]

SaibotC commented 4 years ago

> Using a Huawei P30 Pro, I've added the code to acquire the depth data and apply it to a GLES texture. [...] It doesn't seem very useful. Any ideas?

Any chance you got this to work?

Additionally, have you tested with a Pixel 4? That phone could be more suitable for a depth stream. You can find examples of Pixel 4 depth maps here: https://ai.googleblog.com/2020/04/udepth-real-time-3d-depth-sensing-on.html?m=1

And in the article, Google mentions having given everyone access to this new uDepth API in Camera2.

SSKKBrianG commented 4 years ago

Thank you for the article, @SaibotC. I wasn't able to get beyond what I posted for the Huawei P30 Pro at this point.

odieXin commented 4 years ago

@owahltinez @SaibotC @SSKKBrianG Did any of you try depth preview on a Pixel 4? We found that when the depth preview is rendered together with the YUV preview, the view freezes after about two minutes and shows the warning message below:

W/native: ultradepth_pipeline_manager.cc:63 An internal error occurred: generic::internal: Thermal throttling of computation resources.

Any idea?

reyricoy commented 4 years ago

Hi @SSKKBrianG,

I am currently trying to do the same thing as you. Did you finally manage to do it? I have a Mate 30 Pro and an external, pluggable ToF sensor. I would like to access the ToF data of the Mate 30 Pro and then replace or complement its depth data with data from my external sensor.

SSKKBrianG commented 4 years ago

No, I didn't get anything beyond the image that I posted back in April.

owahltinez commented 3 years ago

Unfortunately, we will not be able to cover this use case in the near future, so we have to close this issue without resolution 👎

If you have any specific questions about depth video output, please use the android-camera tag on Stack Overflow.