IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Improve quality of depth reading #11219

Closed FerdiReh closed 1 year ago

FerdiReh commented 1 year ago
Required Info
Camera Model: D405
Firmware Version: 05.13.00.50
Operating System & Version: Win 10
Kernel Version (Linux Only):
Platform: PC
SDK Version: 2.51.1
Language: Python
Segment: others

Issue Description

Hello, I'm currently trying to track the movement of a human and calculate the total distance traveled with the help of a D405 camera. My issue is that the depth information of the tracked point (e.g. left hand, nose, ...) is not quite accurate, even if the point isn't moving. One example for a point that isn't moving:

[...]
[25, 43] depth_image[y, x] / 10000 = 0.3273 m, depth_frame.get_distance(x, y) = 0.3026999831199646 m
[25, 44] depth_image[y, x] / 10000 = 0.3035 m, depth_frame.get_distance(x, y) = 0.3026999831199646 m
[25, 45] depth_image[y, x] / 10000 = 0.3053 m, depth_frame.get_distance(x, y) = 0.3026999831199646 m
[25, 46] depth_image[y, x] / 10000 = 0.2972 m, depth_frame.get_distance(x, y) = 0.2971999943256378 m
[25, 47] depth_image[y, x] / 10000 = 0.3104 m, depth_frame.get_distance(x, y) = 0.2971999943256378 m
[25, 48] depth_image[y, x] / 10000 = 0.3339 m, depth_frame.get_distance(x, y) = 0.33390000462532043 m
[25, 49] depth_image[y, x] / 10000 = 0.3031 m, depth_frame.get_distance(x, y) = 0.33390000462532043 m
[25, 50] depth_image[y, x] / 10000 = 0.3167 m, depth_frame.get_distance(x, y) = 0.3166999816894531 m
[25, 51] depth_image[y, x] / 10000 = 0.3171 m, depth_frame.get_distance(x, y) = 0.31709998846054077 m
[25, 52] depth_image[y, x] / 10000 = 0.3065 m, depth_frame.get_distance(x, y) = 0.30649998784065247 m
[25, 53] depth_image[y, x] / 10000 = 0.3335 m, depth_frame.get_distance(x, y) = 0.3334999978542328 m
[25, 54] depth_image[y, x] / 10000 = 0.3308 m, depth_frame.get_distance(x, y) = 0.3334999978542328 m
[25, 55] depth_image[y, x] / 10000 = 0.3107 m, depth_frame.get_distance(x, y) = 0.310699999332428 m
[25, 56] depth_image[y, x] / 10000 = 0.3251 m, depth_frame.get_distance(x, y) = 0.310699999332428 m
[25, 57] depth_image[y, x] / 10000 = 0.3107 m, depth_frame.get_distance(x, y) = 0.310699999332428 m
[25, 58] depth_image[y, x] / 10000 = 0.3213 m, depth_frame.get_distance(x, y) = 0.3212999999523163 m
[25, 59] depth_image[y, x] / 10000 = 0.2926 m, depth_frame.get_distance(x, y) = 0.29260000586509705 m
[...]

The real distance is 0.305 m, and because the depth fluctuates, the resulting real-world x and y coordinates jump as well, which adds up to a pretty significant traveled distance:

[Image: plot of the accumulated traveled distance]

In my current code I:

1. Create the pipeline and start streaming from a recorded .bag file
2. Apply the spatial and hole-filling filters
3. Apply skeleton detection and calculate the depth and real-world coordinates of the point of interest

Are there any more filters I could apply? There aren't any objects at a distance > 1 m in the camera's field of view, and I haven't applied the decimation filter, so the points of interest have the same pixel coordinates in both images.
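(For what it's worth, one lightweight complement to the SDK filters would be to smooth the per-frame depth of the tracked point yourself, e.g. with a rolling median over the last few readings. A sketch, not an SDK feature; the window length of 9 is an arbitrary assumption:)

```python
from collections import deque

import numpy as np

# Hypothetical rolling-median smoother for a single tracked point's depth.
class DepthSmoother:
    def __init__(self, window=9):
        self.readings = deque(maxlen=window)  # keep only the last `window` depths

    def update(self, depth_m):
        self.readings.append(depth_m)
        # A median is robust against single-frame outliers, unlike a mean
        return float(np.median(self.readings))

smoother = DepthSmoother(window=9)
noisy = [0.3273, 0.3035, 0.3053, 0.2972, 0.3104, 0.3339, 0.3031, 0.3167, 0.3171]
smoothed = [smoother.update(d) for d in noisy]
print(round(smoothed[-1], 4))  # -> 0.3104, the median of all nine readings
```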

Thank you very much in advance!

Code up to pose detection:

import numpy as np
import matplotlib.pyplot as plt
import pyrealsense2 as rs  # Intel RealSense cross-platform open-source API
import cv2
import mediapipe as mp
from mpl_toolkits import mplot3d
import time

# Path of the saved .bag file
pth = "22_12_14_V3.bag"
# pth = "Duomessung1_VID.bag"

def getCoordinates(depth_frame, px, py, depth):
    depth_intrin = depth_frame.profile.as_video_stream_profile().intrinsics
    x, y, z = rs.rs2_deproject_pixel_to_point(depth_intrin, [px, py], depth)
    return x, y, z

# Create pipeline
pipeline = rs.pipeline()

# Create a config object
config = rs.config()

# Tell config that we will use a recorded device from file to be used by the pipeline through playback.
rs.config.enable_device_from_file(config, pth)

# Configure the pipeline to stream the depth stream
# Change these parameters according to the recorded bag file resolution
config.enable_stream(rs.stream.depth, rs.format.z16, 30)
config.enable_stream(rs.stream.color, rs.format.rgb8, 30)

# Start streaming from file
profile = pipeline.start(config)

# Create colorizer and spatial object
colorizer = rs.colorizer()
spatial = rs.spatial_filter()

# Skeleton detection
# initialize Pose estimator
mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose

pose = mp_pose.Pose(
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5)

count = 0

# ----------------------------- End of initialising -----------------------------
# ----------------------------- Start of image processing -----------------------

while True:
    # Get frameset of depth
    frames = pipeline.wait_for_frames()
    # Get depth frame
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()

    # Enable the spatial filter's hole-filling mode
    # (these options only need to be set once, before the loop)
    spatial.set_option(rs.option.holes_fill, 3)

    # Options for spatial filter
    spatial.set_option(rs.option.filter_magnitude, 5)
    spatial.set_option(rs.option.filter_smooth_alpha, 1)  # alpha in [0.25, 1]; 1 = no smoothing, lower values smooth more
    spatial.set_option(rs.option.filter_smooth_delta, 50)
    filtered_depth = spatial.process(depth_frame)
    colorized_depth = np.asanyarray(colorizer.colorize(filtered_depth).get_data())

    # Colorize depth frame to jet colormap
    depth_color_frame = colorizer.colorize(depth_frame)

    # Convert depth_frame to numpy array to render image in opencv
    depth_image = np.asanyarray(filtered_depth.get_data())
    color_image = np.asanyarray(color_frame.get_data())
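(For reference, rs2_deproject_pixel_to_point in getCoordinates above follows the pinhole camera model; in the distortion-free case it reduces to the sketch below. The intrinsics values here are made-up placeholders, not the D405's real calibration:)

```python
import numpy as np

def deproject_pixel_to_point(fx, fy, ppx, ppy, px, py, depth_m):
    """Pinhole back-projection: the distortion-free case of rs2_deproject_pixel_to_point."""
    x = (px - ppx) / fx * depth_m  # horizontal offset scales with depth
    y = (py - ppy) / fy * depth_m  # vertical offset scales with depth
    return np.array([x, y, depth_m])

# Placeholder intrinsics (NOT real D405 values); a pixel at the principal
# point maps straight ahead, i.e. to [0, 0, depth].
pt = deproject_pixel_to_point(fx=640.0, fy=640.0, ppx=320.0, ppy=240.0,
                              px=320.0, py=240.0, depth_m=0.305)
print(pt)  # -> [0.    0.    0.305]
```

This also makes visible why depth jitter propagates into x and y: both are proportional to the measured depth, so a ±2 cm depth error moves the reconstructed point laterally as well.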

MartyG-RealSense commented 1 year ago

Hi @FerdiReh I would recommend first exploring whether the depth reading inaccuracy may be related to the smoothness of skin and lack of detailed texture on those skin surfaces, as discussed recently at the link below.

https://support.intelrealsense.com/hc/en-us/community/posts/11888301109651

FerdiReh commented 1 year ago

Hello @MartyG-RealSense, thank you for your quick response! That is a really helpful link, and I'll be looking into the problem with the D405 in more detail.

In the following I printed out the measured distances of a rough plaster that is applied to the skin:

[...]
distance = 24.64 cm
distance = 25.04 cm
distance = 24.99 cm
distance = 25.19 cm
distance = 24.88 cm
distance = 25.01 cm
distance = 24.76 cm
distance = 24.76 cm
distance = 25.01 cm
distance = 24.86 cm
distance = 24.96 cm
distance = 25.06 cm
distance = 25.06 cm
distance = 25.19 cm
distance = 24.76 cm
distance = 24.93 cm
distance = 24.86 cm
distance = 24.76 cm
distance = 24.86 cm
distance = 24.71 cm
distance = 24.93 cm
[...]
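(To put a number on that variance, the mean, standard deviation, and spread of the samples above can be computed directly; a quick sanity-check sketch:)

```python
import numpy as np

# The 21 plaster distance readings posted above, in cm
d = np.array([24.64, 25.04, 24.99, 25.19, 24.88, 25.01, 24.76, 24.76, 25.01,
              24.86, 24.96, 25.06, 25.06, 25.19, 24.76, 24.93, 24.86, 24.76,
              24.86, 24.71, 24.93])
print(f"mean = {d.mean():.2f} cm, std = {d.std():.2f} cm, spread = {np.ptp(d):.2f} cm")
```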

It seems like the measurement might be a bit better, but I'm still surprised by the variance. Is this common for this camera type, or could there be another issue with my code?

Thank you very much!

EDIT: I just tested the camera on another, rougher surface at 20 cm distance, and the variance there was only ±2 mm, which seems quite good to me. Do you have any recommendations besides the projector mentioned in your link when it comes to smooth surfaces? The projector unfortunately isn't an option in our environment.

MartyG-RealSense commented 1 year ago

If a RealSense 400 Series camera is not equipped with an infrared dot pattern projector (the D405 lacks one), or has the projector disabled, then the camera can alternatively use ambient light in the scene to analyze surfaces for depth detail. So increasing the strength of the illumination cast onto the skin may help, if that is possible for you.

FerdiReh commented 1 year ago

Alright, I'll try to increase the illumination in our set-up. I really appreciate your quick response and help :)

FerdiReh commented 1 year ago

Hey @MartyG-RealSense, sorry for opening up the discussion again, but I have a quick question about the D435. Would you say that the depth measurements of this model are more consistent in direct comparison to the D405? Thanks in advance!

MartyG-RealSense commented 1 year ago

The D405 is designed for high-quality, high-accuracy images at close range (ideal depth sensing range 7 cm to 50 cm), whilst the D435 suits a wider range of applications thanks to its 10-meter maximum range, but is not as good as the D405 at very close range.

For example, the default minimum depth sensing range of the D435 is around 10 cm. Whilst it may be possible to reach the D405's 7 cm minimum by using the Disparity Shift setting to reduce the D435's minimum distance, the images at that range will not be as good as the D405's, because the D435 is designed for long-range performance.
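(As a rough illustration of why Disparity Shift trades maximum range for minimum range: stereo depth is Z = f·B/d, and the correlation search covers a fixed disparity window, so shifting that window toward larger disparities lets the camera see closer at the cost of far range. A back-of-the-envelope sketch; the focal length, baseline, and 126-pixel search range below are approximate figures, not exact D435 calibration values:)

```python
# Stereo depth: Z = f * B / d  (f in pixels, B in meters, d = disparity in pixels)
f_px = 421.0        # approximate D435 focal length at 848x480 (assumption)
baseline_m = 0.050  # D435 stereo baseline is roughly 50 mm (assumption)
search_px = 126.0   # typical disparity search range (assumption)

def min_depth_m(disparity_shift):
    # The largest representable disparity grows with the shift, so MinZ shrinks
    return f_px * baseline_m / (search_px + disparity_shift)

for shift in (0, 100, 200):
    print(f"disparity_shift={shift:3d} -> MinZ ~ {min_depth_m(shift):.3f} m")
```

Under these assumed numbers, a shift of 0 gives a minimum range of roughly 0.17 m, and larger shifts bring it down toward the D405's territory, while objects beyond the shifted window lose depth entirely.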

FerdiReh commented 1 year ago

The depth readings of the D405 in my case (distance 30 - 50 cm) aren't as consistent as I hoped, varying ±2 cm over time. In your previous answer you said that this could be due to the built-in infrared cut filter. So I'd like to know whether you think the depth readings on skin could be more consistent when using the D435.

MartyG-RealSense commented 1 year ago

The D435, which has a built-in IR dot pattern projector and lacks the D405's IR cut filters (which prevent the D405 from perceiving IR dot patterns), could be helpful.

Before switching cameras, though, you could first try applying a post-processing temporal filter on your D405 with a 'Filter Smooth Alpha' value of 0.1 (instead of the default 0.4) to stabilize fluctuations in the depth reading.
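(At its core, the temporal filter is an exponential moving average, filtered = α·current + (1−α)·previous, plus a delta threshold and persistency logic this sketch omits, so a lower alpha weights history more heavily and damps frame-to-frame jitter. A numpy illustration of α = 0.4 vs. α = 0.1 on synthetic noisy readings around 0.305 m:)

```python
import numpy as np

def ema(depths, alpha):
    """Exponential moving average -- the smoothing core of rs.temporal_filter.
    (The real filter also applies a delta threshold and persistency, omitted here.)"""
    out = [depths[0]]
    for d in depths[1:]:
        out.append(alpha * d + (1 - alpha) * out[-1])
    return np.array(out)

rng = np.random.default_rng(0)
true_depth = 0.305
noisy = true_depth + rng.normal(0.0, 0.01, size=500)  # ~1 cm of synthetic noise

for alpha in (0.4, 0.1):
    resid = ema(noisy, alpha)[100:] - true_depth  # skip the warm-up frames
    print(f"alpha={alpha}: residual std = {resid.std() * 1000:.2f} mm")
```

The trade-off is lag: a low alpha reacts more slowly to genuine motion, which matters when the tracked point (a hand, the nose) actually moves.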