Loss of quality in depth data when saving videos in ROSBAG or .npy frames

Before opening a new issue, we wanted to provide you with some useful suggestions (Click "Preview" above for a better view):
- Consider checking our ROS RealSense Wrapper documentation README.
- Have you looked in our Discussions?
- Try searching our GitHub Issues (open and closed) for a similar issue.
All users are welcomed to report bugs, ask questions, suggest or request enhancements and generally feel free to open new issue, even if they haven't followed any of the suggestions above :)

Required Info
Camera Model	D435i
Operating System & Version	Ubuntu 120

ROS Distro	Humble

Issue Description

#!/usr/bin/env python

import os
import subprocess
import time
import pyrealsense2 as rs
import cv2
import numpy as np
import rospy
from sensor_msgs.msg import Image
from datetime import datetime
from cv_bridge import CvBridge

def get_mocap_topics():
    """List all topics under /mocap_node."""
    topics = subprocess.check_output(['rostopic', 'list']).decode('utf-8').split('\n')
    mocap_topics = [topic for topic in topics if topic.startswith('/mocap_node/')]
    return mocap_topics

def record_rosbag(topics, output_name):
    """Record the specified ROS topics into a rosbag."""
    command = ['rosbag', 'record', '-O', output_name] + topics
    return subprocess.Popen(command)

def setup_realsense():
    """Set up RealSense camera.""" 
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
    config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 30)
    pipeline.start(config)
    return pipeline

def record_realsense(pipeline, output_dir):
    """Record from RealSense camera."""
    video_writer = cv2.VideoWriter(os.path.join(output_dir, 'realsense_video.avi'), cv2.VideoWriter_fourcc(*'XVID'), 30, (1280, 720))
    depth_pub = rospy.Publisher('/camera/depth/image_raw', Image, queue_size=10)
    color_pub = rospy.Publisher('/camera/color/image_raw', Image, queue_size=10)
    bridge = CvBridge()

    frame_interval = 1.0 / 30  # Frame interval for 30 FPS

    try:
        while True:
            start_time = time.time()

            frames = pipeline.wait_for_frames()
            depth_frame = frames.get_depth_frame()
            color_frame = frames.get_color_frame()
            if not depth_frame or not color_frame:
                continue

            depth_image = np.asanyarray(depth_frame.get_data())
            color_image = np.asanyarray(color_frame.get_data())

            # Convert the depth and color images to ROS Image messages
            depth_image_msg = bridge.cv2_to_imgmsg(depth_image, encoding="passthrough")
            color_image_msg = bridge.cv2_to_imgmsg(color_image, encoding="bgr8")

            # Use ROS time for timestamping
            timestamp_ros = rospy.Time.now().to_sec()
            np.savez_compressed(os.path.join(output_dir, f'depth_{timestamp_ros}.npz'), depth_image=depth_image)

            # Add the timestamp to the color image
            font = cv2.FONT_HERSHEY_SIMPLEX
            font_scale = 0.5
            color = (255, 255, 255)  # White color
            thickness = 1
            position = (10, 30)  # Top-left corner
            cv2.putText(color_image, str(timestamp_ros), position, font, font_scale, color, thickness, cv2.LINE_AA)

            # Save the color image with the timestamp as raw file
            np.savez_compressed(os.path.join(output_dir, f'color_{timestamp_ros}.npz'), color_image=color_image)

            video_writer.write(color_image)
            depth_pub.publish(depth_image_msg)
            color_pub.publish(color_image_msg)

            # Display the images (optional)
            cv2.imshow('RealSense Color Image', color_image)
            key = cv2.waitKey(1)
            if key & 0xFF == ord('q'):
                break

            # Ensure the loop runs at the correct frame rate
            elapsed_time = time.time() - start_time
            time.sleep(max(0, frame_interval - elapsed_time))
    finally:
        pipeline.stop()
        video_writer.release()
        cv2.destroyAllWindows()

if __name__ == '__main__':
    rospy.init_node('mocap_realsense_recorder', anonymous=True)

    # Define the directory to store files
    output_dir = os.path.join(os.path.expanduser('~'), datetime.now().strftime("%Y%m%d_%H%M%S"))
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    # Get mocap topics
    mocap_topics = get_mocap_topics()
    if not mocap_topics:
        print("No topics found under /mocap_node")
        exit(1)

    all_topics = mocap_topics + ['/camera/depth/image_raw', '/camera/color/image_raw', '/tf']

    # Record rosbag
    bag_process = record_rosbag(all_topics, os.path.join(output_dir, 'mocap_data.bag'))
    time.sleep(2)  # Give some time for rosbag to initialize

    # Record RealSense data
    realsense_pipeline = setup_realsense()
    try:
        record_realsense(realsense_pipeline, output_dir=output_dir)
    except KeyboardInterrupt:
        pass
    finally:
        # Terminate the rosbag process
        bag_process.terminate()

I created a script that uses YOLOv8 Pose to obtain the 3D skeleton using my RealSense D435i. When I use the camera live, it works correctly.

Subsequently, I created a dataset by saving videos in both ROSBAG and .npy files, including both depth and RGB videos. However, when I try to apply my algorithm to these videos (which seem to be well synchronized), I experience a loss of information in the depth data. I can't recreate the skeleton because it seems that the depth frames are less precise and have holes, especially around the human silhouette.

The intrinsic parameters have been used correctly. Do you have any advice or can you explain why, when saving to ROSBAG with the script mentioned above, I experience this loss of quality? I just want the videos saved in ROSBAG or as individual .npy frames to work the same as when I use the RealSense directly.

Thanks in advance.

Hi @polmagri It looks as though your script is converting RealSense frames to OpenCV format. Is that what is happening, please? If it is then you could be losing some quality in the conversion process.

RealSense depth frames are in a 16-bit format - uint16_t - and converting 16-bit RealSense depth frames to OpenCV images will typically result in an image that is only 8-bit unless it is specified in the script that frames should be converted to a 16-bit OpenCV format, such as CV_16UC1 - please see https://github.com/IntelRealSense/librealsense/issues/12901#issuecomment-2097888508

In regard to the silhoutte around human figures, this sounds like a phenomenon called occlusion that is common in RealSense images of humans.

Removal of occlusion - called 'occlusion invalidation' - is handled automatically in the RealSense SDK when generating 3D pointclouds. For non-pointcloud 2D images you might have to apply hole-filling post-processing to the images in real-time before recording the frames. Hole filling post-processing can be done with the Spatial filter or with the Hole-Filling filter.

More information about occlusion invalidation can be found in Intel's guide 'Projection, Texture-Mapping and Occlusion with Intel RealSense Depth Cameras' at the link below.

https://dev.intelrealsense.com/docs/projection-texture-mapping-and-occlusion-with-intel-realsense-depth-cameras#43-occlusion-invalidation

From the previous script, it's clear that I saved the depth images in 16-bit using the 'z16' format, and in the script where I use them to apply YOLO, I import and synchronize them using these functions:

def extract_and_process_rosbag(bag_file):
    bridge = CvBridge()
    rgb_images = []
    depth_images = []

    with rosbag.Bag(bag_file, 'r') as bag:
        for topic, msg, t in bag.read_messages(topics=['/camera/color/image_raw', '/camera/depth/image_raw']):
            if topic == '/camera/color/image_raw':
                rgb_image = bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
                rgb_images.append(rgb_image)
            elif topic == '/camera/depth/image_raw':
                depth_image = bridge.imgmsg_to_cv2(msg, desired_encoding='passthrough')
                depth_images.append(depth_image)

    assert len(rgb_images) == len(depth_images), "The number of RGB and depth frames do not match."

    return rgb_images, depth_images

And then:

try:
    for color_image, depth_image in zip(rgb_images, depth_images):
        # Remove any filtering step here
        depth_filtered = depth_image

        # Run the YOLO model on the frames
        persons = model(color_image)

        # Normalize depth image for display
        depth_image_display = cv2.normalize(depth_filtered, None, 0, 255, cv2.NORM_MINMAX)
        depth_image_display = cv2.convertScaleAbs(depth_image_display)

        # Convert depth image to color for better visualization
        depth_image_colormap = cv2.applyColorMap(depth_image_display, cv2.COLORMAP_JET)

Could you help me identify any errors I might be missing? Is it possible that the issue is due to the color images being saved in 8-bit while the depth images are in 16-bit, leading to some quality loss in the color images?

I do not have advice to offer about the code, as its combination of RealSense, rospy, OpenCV and YOLO code exceeds my programming knowledge, unfortunately. I do apologize.

Although the RealSense SDK's Z16 format is 16-bit, its RGB8 format is already 8-bit and so saving color as an 8-bit image should not significantly affect it.

Regarding depth though, even if depth is saved to a 16-bit image then there will likely be some loss of depth information unless it is saved in an image format that preserves depth, such as .raw. In regard to non-image formats, .bag and .npy do preserve depth information.

I understand from this case that you are saving in .bag and .npy format. Could you provide more information about the reason for also saving a video in .avi please? Is that for YOLO?

There is a Python script at https://github.com/IntelRealSense/librealsense/issues/4934#issuecomment-537705225 that saves .npy as an array of scaled matrices. If you have not seen it already then it might be a helpful reference to compare to your own code.

Hi @polmagri Do you require further assistance with this case, please? Thanks!

Case closed due to no further comments received.

IntelRealSense / realsense-ros

Loss of quality in depth data when saving videos in ROSBAG or .npy frames #3176

Issue Description