microsoft / HoloLensForCV

Sample code and documentation for using the Microsoft HoloLens for Computer Vision research
MIT License

Converting raw depth data into depth image in meters #37

Closed felixvh closed 6 years ago

felixvh commented 6 years ago

Hi all,

Using HoloLensForCV I am able to get a raw short-throw depth image as a .pgm file. The .pgm file just contains the raw 16-bit data. The next step is to convert this to an actual depth image with distances in meters. Does someone know how to do this?
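For reference, a minimal sketch of the conversion, assuming the .pgm is a binary (P5) 16-bit file and that the raw units are millimeters (so a scale of 0.001 gives meters — verify this against your recorder; the function name is my own, not part of HoloLensForCV):

```python
import numpy as np

def load_pgm_depth_as_meters(path, scale=0.001):
    """Read a binary 16-bit PGM (P5) depth frame and scale raw units to meters.

    `scale` is an assumption: the raw values are commonly millimeters,
    so 0.001 converts to meters.
    """
    with open(path, "rb") as f:
        magic = f.readline().strip()
        assert magic == b"P5", "expected binary PGM"
        # Skip comment lines, then read width/height and the max value.
        line = f.readline()
        while line.startswith(b"#"):
            line = f.readline()
        width, height = map(int, line.split())
        maxval = int(f.readline())
        # Per the PGM spec, 16-bit samples are big-endian.
        dtype = np.dtype(">u2") if maxval > 255 else np.uint8
        raw = np.frombuffer(f.read(), dtype=dtype).reshape(height, width)
    return raw.astype(np.float32) * scale
```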

Additionally, I noticed that in some cases distances are measured incorrectly. In the image below you can see my two screens with the wall above them. Black means closer than white, so in this case the wall appears to be closer than the screens, which is actually wrong. Does someone have an explanation for that?

[image: test2 — short-throw depth frame showing the screens and the wall]

Best regards Felix

FracturedShader commented 6 years ago

The depth cameras use infrared sensors, so it's not perfect. Depending on what the material is made of, it can cause some issues, as you can tell by the keyboard disappearing.

In order to get the raw depth data into a mesh, you need several things. I have been accessing the streams directly myself, so I'm not sure where in the recorded files this lives, but you will need: the range scalar for the depth (should be 0.001), the unprojection map for every pixel of the depth image, the inverse of the view matrix of the particular depth camera, and the transformation from the camera's space to world space.

The process then works as follows: for every pair of floats in the unprojection map, create a ray `Vector3 depthCameraPoint = new Vector3(unproject[pixelIndex, 0], unproject[pixelIndex, 1], 1.0f);`, normalize it, then scale it by the metric depth: `depthCameraPoint *= depthMap[pixelIndex] * 0.001f;`. Be aware that you may want to discard raw values over `0x0FF0`, since that's roughly the cutoff in the data (I had to guess around a bit). Once you have `depthCameraPoint` set up, it's time to get it into world space. To do that you simply compute `depthToWorld * depthViewInverse * depthCameraPoint`. Both `depthToWorld` and the depth view matrix should be given to you (you have to invert the depth view yourself). Again, I'm not sure exactly what this thing actually records and provides. In addition, depending on which library you use for the multiplication, you may have to transpose the matrices first, since some conventions keep the translations in the bottom row of the matrix instead of the right column.
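The steps above can be sketched in NumPy. All input names here are placeholders for the quantities described in the comment (raw depth, per-pixel unprojection map, inverse depth-camera view matrix, camera-to-world transform), not actual HoloLensForCV API; the 0.001 scale and 0x0FF0 cutoff are taken from the comment:

```python
import numpy as np

def depth_to_world_points(depth_raw, unproject, depth_view_inv, depth_to_world,
                          scale=0.001, cutoff=0x0FF0):
    """depth_raw: (H, W) uint16 raw depth values.
    unproject: (H, W, 2) per-pixel (x, y) on the z = 1 camera plane.
    depth_view_inv: 4x4 inverse of the depth camera's view matrix.
    depth_to_world: 4x4 camera-to-world transform.
    Returns (H, W, 3) world-space points; discarded pixels become NaN."""
    h, w = depth_raw.shape
    # Build a ray per pixel from the unprojection map and normalize it.
    rays = np.dstack([unproject, np.ones((h, w))])            # (H, W, 3)
    rays /= np.linalg.norm(rays, axis=2, keepdims=True)
    # Scale raw depth to meters; discard values past the rough cutoff.
    depth_m = depth_raw.astype(np.float32) * scale
    depth_m[depth_raw >= cutoff] = np.nan
    cam_pts = rays * depth_m[..., None]                       # camera space
    # Homogeneous coordinates, then camera -> world in one transform chain.
    ones = np.ones((h, w, 1))
    cam_h = np.concatenate([cam_pts, ones], axis=2).reshape(-1, 4).T
    world = (depth_to_world @ depth_view_inv @ cam_h)[:3].T
    return world.reshape(h, w, 3)
```

This uses row-vectors-last (column-vector) convention, so the matrices multiply on the left; as noted above, transpose first if your source stores translations in the bottom row.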

Hope that helps!

felixvh commented 6 years ago

Thanks for the info! I am now able to save the bitmap in meters. I also found a useful blog: https://mtaulty.com/

As far as I understood you, I basically have to do what is described here: https://docs.microsoft.com/en-us/windows/mixed-reality/locatable-camera#images-with-coordinate-systems

The problem is that the CameraProjectionTransform (what you called the unprojection map) is always null, so I'm stuck there. If anyone knows how to get those values, let me know!
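If the recorded projection transform really is unavailable, one workaround is to build an approximate unprojection map yourself from pinhole intrinsics (focal lengths `fx, fy` and principal point `cx, cy`). This is an assumption, not the documented HoloLens path: the short-throw depth camera has significant lens distortion that a pure pinhole model will not capture, so treat this only as a rough fallback:

```python
import numpy as np

def pinhole_unproject_map(width, height, fx, fy, cx, cy):
    """Build a per-pixel (x, y) unit-plane map from pinhole intrinsics.

    Assumption: a pure pinhole model with no distortion; real depth
    cameras need distortion correction on top of this."""
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    x = (u - cx) / fx
    y = (v - cy) / fy
    return np.dstack([x, y])  # (H, W, 2); z is implicitly 1
```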