IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Temporal and spatial depth filter in python #13099

Closed frankahland closed 2 months ago

frankahland commented 3 months ago
librealsense 2.54.2 RELEASE
OS Windows
Name Intel RealSense L515
Firmware Version 1.5.8.1
Camera Locked YES
Usb Type Descriptor 3.2
Product Line L500
Asic Serial Number 0003cb08b607
Firmware Update Id 0003cb08b607

Hello,

I am trying to get the temporal and spatial depth filters up and running in Python. I googled and found many discussions of people struggling, tried a lot of code, and never got it to work. The filtered and unfiltered images look identical.

Is there one good, working example that I should follow? Ideally a single .py file that applies the filters and shows the results in different windows so you can see the difference.

Basically, what I did is this. In __init__:

    self.temporalDepthFilter = rs.temporal_filter()
    self.temporalDepthFilter.set_option(rs.option.filter_smooth_alpha, 0.4)
    self.temporalDepthFilter.set_option(rs.option.filter_smooth_delta, 20)

and then in the while loop:

    self.depth_frame = aligned_frames.get_depth_frame()
    filtered_depth_temporal = self.temporalDepthFilter.process(self.depth_frame)
    colorizer = rs.colorizer()
    depth_colormap_temporal = np.asanyarray(colorizer.colorize(filtered_depth_temporal).get_data())
    cv2.imshow("Depth Image temporal", depth_colormap_temporal)

Thanks, Frank

MartyG-RealSense commented 3 months ago

Hi @frankahland https://github.com/IntelRealSense/librealsense/issues/1672#issuecomment-387438447 has a simple example of a Python script for applying temporal and spatial filters.

colorizer = rs.colorizer() should be outside of the while loop, as it only needs to be called once. Typically it will be placed before the pipeline start line, such as directly after the cfg stream configuration lines if a script has them.

I am not aware of a pre-existing Python example that shows post-processing in separate panels before and after application of the filters.
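As a rough sketch of what such a comparison could look like (this is not an official example; the stream resolution, window names and filter defaults below are placeholders), applying both filters and showing the raw and filtered depth side by side might be done like this:

import pyrealsense2 as rs
import numpy as np
import cv2

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)

# Create the filters and the colorizer once, before the loop
spatial = rs.spatial_filter()
temporal = rs.temporal_filter()
colorizer = rs.colorizer()

pipeline.start(config)
try:
    while True:
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        if not depth:
            continue

        # Chain the filters; the temporal filter keeps an internal history,
        # so it needs a continuous stream of frames to have a visible effect
        filtered = spatial.process(depth)
        filtered = temporal.process(filtered)

        cv2.imshow("Depth raw", np.asanyarray(colorizer.colorize(depth).get_data()))
        cv2.imshow("Depth filtered", np.asanyarray(colorizer.colorize(filtered).get_data()))
        if cv2.waitKey(1) == 27:
            break
finally:
    pipeline.stop()
    cv2.destroyAllWindows()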

frankahland commented 3 months ago

Hello Marty,

I added my implementation below. It opens four cv2 windows:

All three depth images show the exact same image. Do you have an idea what I am doing wrong?

Thanks a lot, Frank


import time

import cv2
import pyrealsense2 as rs
import numpy as np
from datetime import datetime
mousePointerXY = (400, 300)

def updateMousePosition(event, x, y, flags, param):
    global mousePointerXY
    mousePointerXY = (x, y)

class RealsenseCamera:

    resolution_rgb = (1280,720)
    resolution_depth = (640,480)

    def __init__(self):
        # Configure depth and color streams
        print("Connected Intel Realsense Camera devices:")
        self.context = rs.context()
        for device in self.context.devices:
            print(f"Device {device.get_info(rs.camera_info.name)} connected to laptop")

        self.pipeline = rs.pipeline()

        config = rs.config()
        config.enable_stream(rs.stream.color, self.resolution_rgb[0], self.resolution_rgb[1], rs.format.bgr8, 30)
        config.enable_stream(rs.stream.depth, self.resolution_depth[0], self.resolution_depth[1], rs.format.z16, 30)  # also possible: 1024x768

        self.color_frame = None
        self.depth = None

        # Start streaming
        pipeline_profile = self.pipeline.start(config)
        align_to = rs.stream.color
        self.align = rs.align(align_to)

        # This filter is based on an adaptive filter that uses not only the current frame but also
        # neighboring frames to improve image quality. It is particularly useful for reducing noise
        # and increasing the accuracy of the depth measurements.
        # Default values: https://dev.intelrealsense.com/docs/post-processing-filters
        self.spatialDepthFilter = rs.spatial_filter()
        self.spatialDepthFilter.set_option(rs.option.filter_magnitude, 2)
        self.spatialDepthFilter.set_option(rs.option.filter_smooth_alpha, 0.5)
        self.spatialDepthFilter.set_option(rs.option.filter_smooth_delta, 20)

        # This filter uses a sequence of images to remove temporal discontinuities and improve image quality.
        self.temporalDepthFilter = rs.temporal_filter()
        self.temporalDepthFilter.set_option(rs.option.filter_smooth_alpha, 0.4)
        self.temporalDepthFilter.set_option(rs.option.filter_smooth_delta, 20)

        #hole_filling = rs.hole_filling_filter()
        #depth_filtered = hole_filling.process(depth_frame)

        self.device = pipeline_profile.get_device()
        self.sensor = self.device.query_sensors()[1]  # index 1 is usually the color sensor
        self.colorizer = rs.colorizer()

    def get_frame_stream(self):
        # Wait for a coherent pair of frames: depth and color
        frames = self.pipeline.wait_for_frames()
        aligned_frames = self.align.process(frames)

        self.depth_frame = aligned_frames.get_depth_frame()
        self.color_frame = aligned_frames.get_color_frame()

        if not self.depth_frame or not self.color_frame:
            # If a frame is missing, the camera is probably not connected; return False
            print("Error: could not get the frames, make sure that the Intel RealSense camera is correctly connected")
            return False, None, None, None, None

        filtered_depth_spatial     = self.spatialDepthFilter.process(self.depth_frame)
        filtered_depth_temporal    = self.temporalDepthFilter.process(self.depth_frame)
        #filtered_frame_decimation =  self.decimation_filter.process(self.depth_frame)
        # hole_filling = rs.hole_filling_filter()
        # filled_depth = hole_filling.process(filtered_depth)

        # Create colormaps to show the depth of the objects
        depth_colormap          = np.asanyarray(self.colorizer.colorize(self.depth_frame).get_data())
        depth_colormap_spatial  = np.asanyarray(self.colorizer.colorize(filtered_depth_spatial).get_data())
        depth_colormap_temporal = np.asanyarray(self.colorizer.colorize(filtered_depth_temporal).get_data())

        #depth_image = np.asanyarray(filled_depth.get_data())
        color_image = np.asanyarray(self.color_frame.get_data())

        return True, color_image, depth_colormap, depth_colormap_temporal, depth_colormap_spatial

    def get_distance_from_rgbImagePixel(self, x, y, filterMode=0):
        # Note: all three modes currently return the unfiltered distance. The
        # commented-out lines fail as written because filter.process() returns a
        # generic rs.frame, which must be cast with .as_depth_frame() before
        # calling get_distance() on it.
        if filterMode == 0:
            # Get the distance of a point in the image
            distance = self.depth_frame.get_distance(x, y)
        if filterMode == 1:
            distance = self.depth_frame.get_distance(x, y)
            #distance = self.temporalDepthFilter.process(self.depth_frame).as_depth_frame().get_distance(x, y)
        if filterMode == 2:
            distance = self.depth_frame.get_distance(x, y)
            #distance = self.spatialDepthFilter.process(self.depth_frame).as_depth_frame().get_distance(x, y)

        # convert to mm
        return round(distance * 1000, 2)

    def release(self):
        self.pipeline.stop()

def put_text_shadow(image, position, text):
    # Draw the text twice to create a shadow for better readability
    cv2.putText(image, text, (position[0] + 1, position[1] + 1 - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2, cv2.LINE_AA)
    cv2.putText(image, text, (position[0], position[1] - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 0), 2, cv2.LINE_AA)

def main():
    realSense = RealsenseCamera()
    cv2.namedWindow("rgb")
    cv2.setMouseCallback("rgb", updateMousePosition)

    while True:
        b, rgb_frame, depth_colormap, depth_colormap_temporal, depth_colormap_spatial = realSense.get_frame_stream()

        global mousePointerXY
        cv2.circle(rgb_frame, mousePointerXY, 4, (0, 0, 255))
        distance = realSense.get_distance_from_rgbImagePixel(mousePointerXY[0], mousePointerXY[1])
        put_text_shadow(rgb_frame, (mousePointerXY[0] + 1, mousePointerXY[1] + 1 - 20), "{}mm".format(distance))

        cv2.circle(depth_colormap, (int(mousePointerXY[0] / 1), int(mousePointerXY[1] / 1)), 4, (0, 0, 255))

        cv2.imshow("rgb",rgb_frame)
        cv2.imshow("Depth Image",depth_colormap)
        cv2.imshow("Depth Image spatial",depth_colormap_spatial)
        cv2.imshow("Depth Image temporal",depth_colormap_temporal )

        # # save the current picture every second
        # time.sleep(1.1)
        # current_time = datetime.now().strftime("%Y%m%d_%H%M%S")
        # filename = f"/temp/calibration/saved_image{current_time}.jpg"
        # print(filename)
        # cv2.imwrite(filename, color_image)

        key = cv2.waitKey(1)
        if key == 27:
            break

if __name__ == "__main__":
    main()
frankahland commented 3 months ago


Here is a screenshot of the 4 windows

MartyG-RealSense commented 3 months ago

Hi @frankahland It is recommended that depth-to-color alignment is applied after the post-processing filters have been applied with their '.process' lines. So aligned_frames = self.align.process(frames) should be placed somewhere after the last of the filters, as in the lines below.

filtered_depth_temporal = self.temporalDepthFilter.process(self.depth_frame)

aligned_frames = self.align.process(frames)

Are the images in the windows moving streams or static images of a single frame? If they are static frames then, because the temporal filter generates its changes from a history of previous frames, a single frame would not produce a noticeable change.

https://github.com/IntelRealSense/librealsense/issues/4468#issuecomment-513485662 advises that the spatial filter only provides a modest improvement in depth image quality.


If the windows have moving streams that are continually updating, then you may see a more noticeable change in the temporal filter image if you set its Filter Smooth Alpha to 0.1 instead of 0.4.
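As a hypothetical illustration using the objects from your script (the frame count of 10 is just an arbitrary warm-up value), the change would look something like this:

    # Lower alpha = stronger temporal smoothing (more weight on the frame history)
    self.temporalDepthFilter.set_option(rs.option.filter_smooth_alpha, 0.1)

    # Because the filter builds up an internal history, let it see a run of
    # consecutive frames before judging the result against the raw depth
    for _ in range(10):
        frames = self.pipeline.wait_for_frames()
        filtered = self.temporalDepthFilter.process(frames.get_depth_frame())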

frankahland commented 3 months ago

Hello Marty,

I am streaming the video.

When I move the line aligned_frames = self.align.process(frames)

from here

        aligned_frames = self.align.process(frames)

        self.depth_frame = aligned_frames.get_depth_frame()
        self.color_frame = aligned_frames.get_color_frame()

        filtered_depth_spatial     = self.spatialDepthFilter.process(self.depth_frame)
        filtered_depth_temporal    = self.temporalDepthFilter.process(self.depth_frame)

to here (below the filter.process lines):

        self.depth_frame = aligned_frames.get_depth_frame()
        self.color_frame = aligned_frames.get_color_frame()

        filtered_depth_spatial     = self.spatialDepthFilter.process(self.depth_frame)
        filtered_depth_temporal    = self.temporalDepthFilter.process(self.depth_frame)

        aligned_frames = self.align.process(frames)

then the lines self.depth_frame = aligned_frames.get_depth_frame() and self.color_frame = aligned_frames.get_color_frame() no longer have aligned_frames to get their frames from, and the unaligned frames go into temporalDepthFilter.process.

Did I get it wrong?

Thanks, Frank

MartyG-RealSense commented 3 months ago

You did not get it wrong. Whilst it is recommended (not compulsory) that align is placed after the filters, there are sometimes projects that perform worse when doing so. In those cases, it is acceptable to place alignment before the filters. There is the risk though that distortion effects such as anti-aliasing (jagged lines) will occur.

frankahland commented 3 months ago

I solved it with my own implementation of an averaged depth image.

I can configure how many frames are used for averaging the pixels and how often the whole image is updated.

That's perfect for my application, and the performance is good.
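For illustration: the averaging window is the FIFO size depthAverageQueueSite set in __init__, and the refresh interval is the constructor parameter depthImageCounterDelayFrames, so a hypothetical instantiation looks like this:

    # Recompute the averaged depth image only every 15 frames; the per-pixel
    # averaged distances are still computed on every frame
    realSense = RealsenseCamera(depthImageCounterDelayFrames=15)

The full script: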

import time

import cv2
import os
import json
import pyrealsense2 as rs
import numpy as np
from datetime import datetime

mousePointerXY = (400, 300)

from collections import deque

def put_text_shadow(image, position, text):
    # Draw the text twice to create a shadow for better readability
    cv2.putText(image, text, (position[0] + 1, position[1] + 1 - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2, cv2.LINE_AA)
    cv2.putText(image, text, (position[0], position[1] - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 0), 2, cv2.LINE_AA)

def getValueFromTrackbar(window_name, trackbar_name, initial_value, max_value, file_path='/temp/trackbar_values.json'):
    global trackbar_values
    # Start from an empty dict so the name is defined even if the file is missing
    trackbar_values = {}
    # Load previous values if the file exists and is valid JSON
    if os.path.exists(file_path):
        try:
            with open(file_path, 'r') as file:
                trackbar_values = json.load(file)
        except (json.JSONDecodeError, IOError) as e:
            # If the file is empty or not valid JSON, keep the empty dictionary
            print(f"Warning: Could not read file {file_path}. Error: {e}")
            trackbar_values = {}

    # Check if the trackbar already exists in the window
    trackbar_key = f"{window_name}_{trackbar_name}"
    if trackbar_key in trackbar_values:
        initial_value = trackbar_values[trackbar_key]
    else:
        trackbar_values[trackbar_key] = initial_value

    def on_trackbar_change(value, trackbar_key=trackbar_key):
        global trackbar_values
        trackbar_values[trackbar_key] = value
        with open(file_path, 'w') as file:
            json.dump(trackbar_values, file)

    # Add the trackbar if it doesn't already exist
    try:
        current_pos = cv2.getTrackbarPos(trackbar_name, window_name)
    except cv2.error:
        cv2.createTrackbar(trackbar_name, window_name, initial_value, max_value, lambda v: on_trackbar_change(v))

    return cv2.getTrackbarPos(trackbar_name, window_name)

class DepthImageFIFO:
    def __init__(self, max_size):
        self.queue = deque(maxlen=max_size)
        self.max_size = max_size

    def add_image(self, image):
        self.queue.append(image)

    def get_average_distance(self, x, y):
        if not self.queue:
            return 0
        x = int(x)
        y = int(y)
        # Compute the average of the depth values at point (x, y)
        sum_distance = 0
        for image in self.queue:
            sum_distance += image[y, x]
        average_distance = sum_distance / len(self.queue)
        return average_distance

    def get_average_image(self):
        if not self.queue:
            return 0

        # Compute the average of all depth images in the queue
        sum_image = np.sum(np.array(self.queue), axis=0)
        average_image = sum_image / len(self.queue)

        return average_image

def updateMousePosition(event, x, y, flags, param):
    global mousePointerXY
    mousePointerXY = (x, y)

class RealsenseCamera:
    resolution_rgb = (1280, 720)
    resolution_depth = (640, 480)

    def __init__(self, depthImageCounterDelayFrames=10):
        # Configure depth and color streams
        print("Connected Intel Realsense Camera devices:")
        self.context = rs.context()
        for device in self.context.devices:
            print(f"Device {device.get_info(rs.camera_info.name)} connected to laptop")

        self.pipeline = rs.pipeline()

        config = rs.config()
        config.enable_stream(rs.stream.color, self.resolution_rgb[0], self.resolution_rgb[1], rs.format.bgr8, 30)
        config.enable_stream(rs.stream.depth, self.resolution_depth[0], self.resolution_depth[1], rs.format.z16, 30)  # also possible: 1024x768

        self.color_frame = None
        self.depth = None

        # Start streaming
        pipeline_profile = self.pipeline.start(config)
        align_to = rs.stream.color
        self.align = rs.align(align_to)
        self.depthAverageQueueSite = 10
        self.avgDepthImageCounter = 0
        self.depthImageCounterDelayFrames = depthImageCounterDelayFrames

        self.fifoForAverageDepth = DepthImageFIFO(max_size=self.depthAverageQueueSite)  # FIFO mit maximaler Größe 10

        # FILTER ######################################
        # The spatial filter is computationally very expensive, and what the temporal filter does, I do not know.
        # So the sensible approach was to build my own filter that simply averages the last n images and colorizes them itself.
        # A new average image is only computed every 15 frames; the per-point distances are computed for every frame.

        # # This filter is based on an adaptive filter that uses not only the current
        # # frame but also neighboring frames to improve image quality. It is particularly
        # # useful for reducing noise and increasing the accuracy of the depth measurements.
        # # Default values: https://dev.intelrealsense.com/docs/post-processing-filters
        # self.spatialDepthFilter = rs.spatial_filter()
        # self.spatialDepthFilter.set_option(rs.option.filter_magnitude, 2)
        # self.spatialDepthFilter.set_option(rs.option.filter_smooth_alpha, 0.3)
        # self.spatialDepthFilter.set_option(rs.option.filter_smooth_delta, 20)
        #
        # # This filter uses a sequence of images to remove temporal discontinuities and improve image quality.
        # self.temporalDepthFilter = rs.temporal_filter()
        # self.temporalDepthFilter.set_option(rs.option.filter_smooth_alpha, 0.1)
        # self.temporalDepthFilter.set_option(rs.option.filter_smooth_delta, 20)

        # hole_filling = rs.hole_filling_filter()
        # depth_filtered = hole_filling.process(depth_frame)

        self.device = pipeline_profile.get_device()
        self.sensor = self.device.query_sensors()[1]  # index 1 is usually the color sensor
        #self.colorizer = rs.colorizer()

    def get_frame_stream_fixed_depth(self, min_depth, max_depth):
        # Wait for a coherent pair of frames: depth and color
        frames = self.pipeline.wait_for_frames()

        aligned_frames = self.align.process(frames)

        self.depth_frame = aligned_frames.get_depth_frame()
        self.color_frame = aligned_frames.get_color_frame()

        if not self.depth_frame or not self.color_frame:
            # If a frame is missing, the camera is probably not connected; return False
            print("Error: could not get the frames, make sure that the Intel RealSense camera is correctly connected")
            return False, None, None

        # filtered_depth_spatial = self.spatialDepthFilter.process(self.depth_frame)
        # filtered_depth_temporal = self.temporalDepthFilter.process(self.depth_frame)
        # # filtered_frame_decimation =  self.decimation_filter.process(self.depth_frame)
        # # hole_filling = rs.hole_filling_filter()
        # # filled_depth = hole_filling.process(filtered_depth)

        # Create colormap to show the depth of the objects
        #depth_colormap = np.asanyarray(self.colorizer.colorize(self.depth_frame).get_data())

        # colormap for filters i do not use
        # depth_colormap_spatial = np.asanyarray(self.colorizer.colorize(filtered_depth_spatial).get_data())
        # depth_colormap_temporal = np.asanyarray(self.colorizer.colorize(filtered_depth_temporal).get_data())

        # depth_image = np.asanyarray(filled_depth.get_data())
        color_image = np.asanyarray(self.color_frame.get_data())

        depth_image = np.asanyarray(self.depth_frame.get_data())
        self.fifoForAverageDepth.add_image(depth_image)
        if self.avgDepthImageCounter <= 0:  # only compute a new average image every 10th frame; the pixels in get_average_distance are computed on every frame
            self.average_image = self.fifoForAverageDepth.get_average_image()
            self.avgDepthImageCounter = self.depthImageCounterDelayFrames
        self.avgDepthImageCounter -= 1
        depth_colormap = self.colorize_depth_image(self.average_image, min_depth, max_depth)

        return True, color_image, depth_colormap

    def get_frame_stream_configurable_depth(self, windowName):
        # Get the depth-value limits (applied before normalization) from trackbars that persist their own values
        min_depth = getValueFromTrackbar(windowName, "DistanceMin", 1400, 4000)
        max_depth = getValueFromTrackbar(windowName, "DistanceMax", 3000, 4000)
        return self.get_frame_stream_fixed_depth(min_depth, max_depth)

    def get_distance_from_rgbImagePixel_in_mm(self, x, y, averaged=True):
        if averaged:
            distance = self.fifoForAverageDepth.get_average_distance(x,y)
            distance_mult = distance * self.depth_frame.get_units()
            # convert to mm
            return round(distance_mult * 1000, 1)
        else:
            distance = self.depth_frame.get_distance(x, y)
            # convert to mm
            return round(distance * 1000, 1)

    def release(self):
        self.pipeline.stop()

    def colorize_depth_image(self, depth_image, min_depth, max_depth):

        # Clip the depth values before normalization:
        # everything below min_depth becomes min_depth, and likewise at the upper end
        depth_image_clipped = np.clip(depth_image, min_depth, max_depth)

        # Normalize the depth image to 0..255 for visualization
        depth_image_normalized = cv2.normalize(depth_image_clipped, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)

        # Apply the colormap
        depth_colored = cv2.applyColorMap(depth_image_normalized, cv2.COLORMAP_INFERNO)

        return depth_colored

def main():
    realSense = RealsenseCamera()
    cv2.namedWindow("rgb")
    cv2.setMouseCallback("rgb", updateMousePosition)

    # Set the brightness
    #realSense.sensor.set_option(rs.option.brightness, 10)  # standard range: 0-100

    #cv2.namedWindow("DepthAVG", cv2.WINDOW_NORMAL)

    # The window must already exist when get_frame_stream_configurable_depth is called so that the trackbars can be created
    window_name_DepthAVG = "DepthAVG"
    cv2.namedWindow(window_name_DepthAVG, cv2.WINDOW_NORMAL)

    queueAvg = deque(maxlen=realSense.depthAverageQueueSite)
    queueNonAvg = deque(maxlen=realSense.depthAverageQueueSite)

    while True:
        b, rgb_frame, depth_colormap = realSense.get_frame_stream_configurable_depth(window_name_DepthAVG)  # shows trackbars in the window that displays the depth image
        #b, rgb_frame, depth_colormap = realSense.get_frame_stream_fixed_depth(min_depth=2000, max_depth=2500)  # pass fixed values for min and max depth; values below or above are clipped

        global mousePointerXY
        #depth_avg_colormap = np.asanyarray(realSense.colorizer.colorize(average_image).get_data())

        cv2.circle(rgb_frame, mousePointerXY, 4, (0, 0, 255))
        distance_non_avg = realSense.get_distance_from_rgbImagePixel_in_mm(mousePointerXY[0], mousePointerXY[1], False)
        distance = realSense.get_distance_from_rgbImagePixel_in_mm(mousePointerXY[0], mousePointerXY[1])
        put_text_shadow(rgb_frame, (mousePointerXY[0] + 1, mousePointerXY[1] + 1 - 20), f"{distance}mm, non avg= {distance_non_avg}mm")

        queueAvg.append(distance)
        queueNonAvg.append(distance_non_avg)
        min_distance = min(queueAvg)
        max_distance = max(queueAvg)
        min_distance_nonavg = min(queueNonAvg)
        max_distance_nonavg = max(queueNonAvg)
        print(f"Distance last {realSense.depthAverageQueueSite} frames. AVG min/max = {min_distance}/{max_distance},  NON AVG min/max = {min_distance_nonavg}/{max_distance_nonavg}")

        cv2.circle(depth_colormap, (int(mousePointerXY[0] / 1), int(mousePointerXY[1] / 1)), 4, (0, 0, 255))

        cv2.imshow("rgb", rgb_frame)
        # The distance is computed for every frame; the average image only every n frames
        cv2.imshow(window_name_DepthAVG, depth_colormap)

        # # save the current picture every second
        # time.sleep(1.1)
        # current_time = datetime.now().strftime("%Y%m%d_%H%M%S")
        # filename = f"/temp/calibration/saved_image{current_time}.jpg"
        # print(filename)
        # cv2.imwrite(filename, color_image)

        key = cv2.waitKey(1)
        if key == 27:
            break

if __name__ == "__main__":
    main()
MartyG-RealSense commented 3 months ago

It's great to hear that you achieved a solution. Thanks so much for sharing the code!

MartyG-RealSense commented 2 months ago

Case closed due to solution achieved and no further comments received.