microsoft / HoloLensForCV

Sample code and documentation for using the Microsoft HoloLens for Computer Vision research
MIT License

Banding on depth output #19

Open MichealReed opened 6 years ago

MichealReed commented 6 years ago

When using the recorder to capture depth data, we are seeing a circular image with significant banding. Any idea what's causing this? Since the API providing the image only returns frames and not raw data, we have little to go on when troubleshooting.

FracturedShader commented 6 years ago

I modified the "Compute on Device" sample to grab the LongThrowToFDepth frames and saw the same thing. The circular part seems to just be how it is captured, but the banding was a result of trying to fit the 16-bit depth onto the 8-bit texture. Rather than scaling the values down, it simply overflows several times. Sixteen exactly, actually. That's important later.
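
To see where the count of sixteen comes from, here is a minimal Python sketch. It assumes the depth values span 12 bits (0 to 4095, consistent with the values reported in this thread) and that the overflow behaves like keeping only the low byte of each value. The function names are made up for illustration and are not part of the sample.

```python
# Assumption: depth values use 12 bits, so they range from 0 to 4095.
MAX_DEPTH = 4095

def truncate_to_8bit(value):
    """Keep only the low byte, as a naive 16-bit to 8-bit copy would."""
    return value & 0xFF

def scale_to_8bit(value, max_value=MAX_DEPTH):
    """Rescale the full range into 0..255 instead of letting it wrap."""
    return value * 255 // max_value

# A smooth ramp over the whole depth range.
ramp = range(MAX_DEPTH + 1)
truncated = [truncate_to_8bit(v) for v in ramp]

# Each time the truncated value drops, a new band starts.
wraps = sum(1 for a, b in zip(truncated, truncated[1:]) if b < a)
bands = wraps + 1
print(bands)  # 16
```

Scaling with something like scale_to_8bit would remove the bands at the cost of precision, which is why keeping the full 16 bits is the nicer fix.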

Creating a texture that uses the DXGI_FORMAT_R16_UNORM format and making sure that the wrapped OpenCV image gets copied properly is the way to get a nice smooth depth map. Right now OpenCVHelpers::CreateOrUpdateTexture2D forces 4x8-bit formats, but if you change line 38 in Shared/OpenCVHelpers/OpenCVTexture2D.cpp to use image.type() instead of CV_8UC4, it will simply copy whatever format the image is using. Granted, doing that results in a faded red texture, so you then need to open Shared/Rendering/SlateMaterial.Default.ps.hlsl and change line 41 to return min16float4(sampledColor.rrr * 16.0f, 1.0f); to make it look right. We have to multiply by 16 because the HoloLens does not fill the top four bits of the 16-bit depth value, so it will never exceed 1/16th of the normalized [0-1] range.
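
A quick arithmetic check of the factor of 16, as a Python sketch. It assumes the usual UNORM convention where a 16-bit texel value v is sampled in the shader as v / 65535, and a maximum raw depth of 4095. The helper name is hypothetical.

```python
def unorm16_to_float(raw):
    """Normalized value a shader would sample from a 16-bit UNORM texel."""
    return raw / 65535.0

# Assumption: only the low 12 bits of the depth value are ever set.
MAX_RAW_DEPTH = 4095

# Brightest value the texture can ever hold, roughly 1/16 of full scale.
brightest = unorm16_to_float(MAX_RAW_DEPTH)

# After multiplying by 16 in the pixel shader, it lands close to 1.0.
rescaled = min(brightest * 16.0, 1.0)
print(brightest, rescaled)
```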

Hope that helps!

jiying61306 commented 6 years ago

I ran into the same problem. The cause is the byte order (endianness) of each pixel. For example, the raw data of the first pixel in the .pgm file is 0xff0f, which can be read as either 65295 or 4095 in base ten depending on the byte order. 65295 is wrong; 4095 is right.
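
The two readings can be reproduced with Python's struct module. This is only an illustration of the byte-order point, assuming the two bytes appear in the file as ff followed by 0f.

```python
import struct

# The two bytes of the first pixel, in the order they appear in the file.
raw = bytes([0xFF, 0x0F])

# Big-endian: first byte is the most significant one.
big_endian = struct.unpack(">H", raw)[0]

# Little-endian: first byte is the least significant one.
little_endian = struct.unpack("<H", raw)[0]

print(big_endian)     # 65295
print(little_endian)  # 4095
```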

Huangying-Zhan commented 5 years ago

In case people still have questions about this banding issue: @jiying61306 has pointed out the problem. It comes down to the byte order of each pixel. Basically, you need to swap the two bytes, e.g. from 0x ff 0f to 0x 0f ff. Here is the example code I used to get the correct values from the original .pgm file.

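import numpy as np

def pgm2distance(img, encoded=False):
    # `encoded` is unused; it is kept so existing calls keep working.
    distance = np.zeros(img.shape)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            # Format the 16-bit pixel as four hex digits, e.g. 0xff0f -> "ff0f".
            val = hex(int(img[y, x]))[2:].zfill(4)
            # Swap the two bytes: "ff0f" -> "0fff".
            val = val[-2:] + val[:2]
            # Parse the swapped value and divide by 1000.
            distance[y, x] = int(val, 16) / 1000.0
    return distance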
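
The same byte swap can also be written with plain bit arithmetic instead of hex strings. This is a sketch only. swap16 and raw_to_distance are names invented here, and the division by 1000 simply mirrors pgm2distance above.

```python
def swap16(value):
    """Swap the high and low bytes of a 16-bit unsigned integer."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)

def raw_to_distance(value):
    """Byte-swap a raw pixel value and divide by 1000, as pgm2distance does."""
    return swap16(value) / 1000.0

print(hex(swap16(0xFF0F)))      # 0xfff
print(raw_to_distance(0xFF0F))  # 4.095
```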

Also, please note that the captured data is actually distance, not depth (Z). You may need to do some further processing in order to get real depth. Please refer to here for the details.
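
One common way to turn a per-pixel distance along the viewing ray into depth (Z) is to divide by the length of that pixel's ray through the unit plane. The sketch below assumes you already have, for each pixel, coordinates (u, v) on the z = 1 plane from the camera's unprojection mapping. The function name and inputs are hypothetical, and the linked details should take precedence wherever they differ.

```python
import math

def distance_to_depth(distance, u, v):
    """Convert a distance along the ray through (u, v, 1) into its Z coordinate.

    Assumption: (u, v) are the pixel's coordinates on the z = 1 plane.
    """
    ray_length = math.sqrt(u * u + v * v + 1.0)
    return distance / ray_length

# On the optical axis, distance and depth coincide.
print(distance_to_depth(2.0, 0.0, 0.0))  # 2.0
```

Away from the optical axis the ray is longer than its Z component, so the depth comes out smaller than the measured distance.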

Hope this helps!

cyrineee commented 5 years ago

@Huangying-Zhan in which file did you make the modification? I want to know the distance of the object. Thanks in advance.