IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Apply filters to numpy images #11788

Closed saurabheights closed 1 year ago

saurabheights commented 1 year ago

Required Info
Camera Model { D455 }
Firmware Version (Open RealSense Viewer --> Click info)
Operating System & Version Linux (Ubuntu 22.04)
Kernel Version (Linux Only) 5.19.0-41-generic
Platform PC
SDK Version 2.53.1.4623
Language {python}
Segment {AR and 3D reconstruction }

Issue Description

Hi, I have implemented a multi-camera capture pipeline that allows me to read from multiple cameras and do recording, processing, etc. For offline processing, fast prototyping and reproducibility, I chose to work with a dumped dataset. Here, recording means saving aligned color and depth images to disk along with the camera intrinsics. I have a PseudoCameraManager class which reads the dumped dataset, and I am able to do multi-camera pose estimation and merge the point clouds from multiple cameras.

The issue I am facing is how to apply post-processing filters to numpy arrays.

depth = o3d.t.io.read_image(depth_path)
depth_np = np.asarray(depth)
r = rs.frame()
r.data = depth_np  # fails: rs.frame exposes data as a read-only property
Traceback (most recent call last):
  File "/home/sk/anaconda3/envs/lufthansa/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-9-03aedff5c7bb>", line 1, in <module>
    r.data = depth_np
AttributeError: can't set attribute

Is there a way to generate a realsense frame, or alternatively to use the filters directly on numpy arrays? If not, would you be interested in a PR?

I am new to the repo, so forgive the level of naivety. I will deep-dive this week and see the complexity I will have to face. My initial thought is that we can either provide a way to convert a numpy array to a frame, or simplify the filter interface to accept a simpler data structure than frame wherever possible.

P.S. I have gone through these issues: https://github.com/IntelRealSense/librealsense/issues/2551 and https://github.com/IntelRealSense/librealsense/issues/5784

MartyG-RealSense commented 1 year ago

Hi @saurabheights Since https://github.com/IntelRealSense/librealsense/issues/5784, I am not aware of any further progress having been made in converting numpy to rs2::frame.

At the link below, Intel provide a Python tutorial for using post-processing filters with numpy arrays.

https://github.com/IntelRealSense/librealsense/blob/jupyter/notebooks/depth_filters.ipynb

saurabheights commented 1 year ago

@MartyG-RealSense Thank you for your answer, but I don't think that's correct. depth_filters.ipynb applies the filters to a pyrealsense2.pyrealsense2.depth_frame, followed by a depth_frame.get_data call -> np.asanyarray -> matplotlib.pyplot. What I need is the reverse, something like depth_frame.from_data...

MartyG-RealSense commented 1 year ago

Data that has been converted from rs2::frame can be converted back to rs2::frame using the RealSense SDK's software-device interface, though it does not always work with pyrealsense2. A Python example use of software-device can be found at https://github.com/IntelRealSense/librealsense/issues/7057

mistimohan commented 1 year ago

I am interested in this feature too. Presently I am using rs2_deproject_pixel_to_point to convert the modified numpy array into a point cloud, as mentioned in https://medium.com/@yasuhirachiba/converting-2d-image-coordinates-to-3d-coordinates-using-ros-intel-realsense-d435-kinect-88621e8e733a . I want to know if there is any better solution to convert the altered numpy array into a point cloud without one-by-one pixel-to-point conversion. I am comfortable using either realsense-ros or pyrealsense2. Thank you.

MartyG-RealSense commented 1 year ago

Hi @mistimohan The main alternative to using rs2_deproject_pixel_to_point to generate a pointcloud is to use map_to and pc.calculate to map color onto a depth pointcloud, like in the example Python code at https://github.com/IntelRealSense/librealsense/issues/4612#issuecomment-566864616 (though this does not help if you want to turn the numpy array back into rs2::frame again afterwards).

mistimohan commented 1 year ago

Thank You for reply @MartyG-RealSense but my concern is :

        def main():
            try:
                print("start")
                # Create a CvBridge object to convert ROS messages to OpenCV images
                bridge = CvBridge()
                camera_info_msg = rospy.wait_for_message('/camera/aligned_depth_to_color/camera_info', CameraInfo)
                # depth_image_msg = rospy.wait_for_message('/camera/depth/image_rect_raw', Image)
                depth_image_msg = rospy.wait_for_message('/camera/aligned_depth_to_color/image_raw', Image)

                print('got depth image')
                depth_image = bridge.imgmsg_to_cv2(depth_image_msg, desired_encoding="passthrough")
                print(depth_image.shape, type(depth_image), depth_image.dtype, depth_image[100, 100])

                rgb_image_msg = rospy.wait_for_message('/camera/color/image_raw', Image)

                # Convert the RGB image message to a numpy array
                rgb_image = bridge.imgmsg_to_cv2(rgb_image_msg, desired_encoding="bgr8")

                # Build a binary mask and keep only the depth pixels inside it
                masked_image = generate_mask(rgb_image)
                masked_image[masked_image > 0] = 1
                masked_depth_image = np.multiply(depth_image, masked_image)

                # Deproject every masked pixel one by one (the slow part)
                points = []
                for i in range(masked_depth_image.shape[0]):
                    for j in range(masked_depth_image.shape[1]):
                        result = convert_depth_to_phys_coord_using_realsense(
                            i, j, masked_depth_image[i, j] * 0.001, camera_info_msg)
                        x = result[2]
                        y = -result[1]
                        z = -result[0]
                        points.append([x + 0.8, y + 0.4, z + 1.8, z])
As far as I know, I cannot send this modified depth image (with region of interest) into pc.calculate(), as you said "(though this does not help if you want to turn the numpy array back into rs2::frame again afterwards)". But converting every point one by one using rs2.deproject_pixel_to_point doesn't seem efficient in this case. Is there any workaround for this situation?
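For what it's worth, when the depth image is already aligned to color and lens distortion can be ignored, the per-pixel double loop above can be replaced with a single numpy broadcast. A minimal sketch under those assumptions; the intrinsics `fx`, `fy`, `ppx`, `ppy` are placeholder values that would in practice come from `camera_info_msg`, and this reimplements the zero-distortion case of `rs2_deproject_pixel_to_point` in numpy rather than calling the SDK:

```python
import numpy as np

def deproject_depth_image(depth_m, fx, fy, ppx, ppy):
    """Vectorized pinhole deprojection: depth image (meters) -> Nx3 points.

    Equivalent in spirit to calling rs2_deproject_pixel_to_point per pixel
    (zero-distortion case), but done in one shot with numpy broadcasting.
    """
    h, w = depth_m.shape
    # j holds column indices (image x), i holds row indices (image y)
    j, i = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (j - ppx) * z / fx
    y = (i - ppy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Example with hypothetical intrinsics: one 1000 mm pixel at row 100, column 320
depth = np.zeros((480, 640), dtype=np.uint16)
depth[100, 320] = 1000
pts = deproject_depth_image(depth * 0.001, fx=600.0, fy=600.0, ppx=320.0, ppy=240.0)
```

Note the column/row convention: `rs2_deproject_pixel_to_point` takes the pixel as (x = column, y = row), so a masked depth array can be deprojected in one call and the result filtered afterwards, instead of looping.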

MartyG-RealSense commented 1 year ago

Instead of converting every coordinate in the image, you can process more efficiently by converting only the individual coordinates that you need, using the instruction rs2_project_color_pixel_to_depth_pixel to map a 2D color pixel to its corresponding depth pixel, like in the Python script at https://github.com/IntelRealSense/librealsense/issues/5603#issuecomment-574019008

mistimohan commented 1 year ago

Okay, I will try this and let you know. Thank you for your quick response. @MartyG-RealSense

MartyG-RealSense commented 1 year ago

You are very welcome, @mistimohan - I look forward to your next report. Thanks very much for the update!

MartyG-RealSense commented 1 year ago

Hi @saurabheights Do you require further assistance with this case, please? Thanks!

saurabheights commented 1 year ago

@MartyG-RealSense We have taken a pause on this direction at the moment. We will do some experimentation with Neural Rendering Fields before we come back to this.

I will check https://github.com/IntelRealSense/librealsense/issues/11788#issuecomment-1543861896 and if it solves the problem, I will let you know here. If you prefer, you can close the issue for now, and I can reopen if the issue persists.

Thank you for your help.

MartyG-RealSense commented 1 year ago

As you will be working on Neural Rendering Fields first, I will close this for now and you are welcome to re-open it later if you need to. Thanks very much for the update!

saurabheights commented 1 year ago

@MartyG-RealSense - I implemented the temporal filter in Python using numpy. It was quite fast to get a trivial implementation done (similar to the Always On case), which satisfies our need of a static camera + static scene.

I will implement age computation of the last valid value seen for a given pixel, to ensure we don't show a too-old previous value; that should take care of the "valid X in last Y frames" options provided by the RealSense Viewer.

Just dropping this information to wrap up the ticket. Thank you for your assistance.
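The approach described above (exponential smoothing plus a per-pixel validity age) can be sketched in pure numpy. This is an independent reimplementation under stated assumptions (uint16 depth, zeros mark invalid pixels), not saurabheights' code or the librealsense temporal filter itself:

```python
import numpy as np

class TemporalFilter:
    """Minimal numpy temporal filter for uint16 depth images (zeros = invalid).

    Mimics the spirit of the SDK temporal filter: exponential smoothing with
    factor `alpha`, plus persistence that reuses the last valid value of a
    pixel for at most `max_age` frames (large max_age approximates Always On).
    """

    def __init__(self, alpha=0.4, max_age=8):
        self.alpha = alpha
        self.max_age = max_age
        self.state = None  # smoothed depth, float
        self.age = None    # frames since each pixel was last valid

    def process(self, depth):
        d = depth.astype(np.float64)
        valid = d > 0
        if self.state is None:
            self.state = np.zeros_like(d)
            self.age = np.full(d.shape, self.max_age + 1)
        blend = valid & (self.age <= self.max_age)  # fresh history: smooth
        reset = valid & (self.age > self.max_age)   # no usable history: take new value
        self.state[blend] = self.alpha * d[blend] + (1 - self.alpha) * self.state[blend]
        self.state[reset] = d[reset]
        self.age = np.where(valid, 0, self.age + 1)
        # Emit the last valid value only while it is fresh enough
        out = np.where(self.age <= self.max_age, self.state, 0)
        return out.astype(depth.dtype)

# Usage on a single-pixel stream (hypothetical values)
f = TemporalFilter(alpha=0.5, max_age=2)
out1 = f.process(np.array([[1000]], dtype=np.uint16))  # first valid sample
out2 = f.process(np.array([[2000]], dtype=np.uint16))  # 0.5*2000 + 0.5*1000
out3 = f.process(np.zeros((1, 1), dtype=np.uint16))    # invalid: last value persists
```

Invalid pixels return the smoothed value until their age exceeds `max_age`, after which they go back to 0, which is one way to realize the "valid X in last Y frames" behavior.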

MartyG-RealSense commented 1 year ago

Thanks so much @saurabheights for sharing your approach to a solution!

SalmaG98 commented 1 year ago

@saurabheights can you please share with me your implementation of the filters, or your way of casting a numpy array as an rs.frame()? I have a very stretched point cloud that I obtain by converting a depth map saved as a numpy array using this function. Since the obtained point cloud is very stretched, I have considered first filtering the depth numpy array before converting it, using the filters from the depth_filters notebook (the "putting everything together" cell).
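Until a numpy-to-rs.frame conversion exists, zero (invalid) depth pixels can at least be patched directly in numpy before deprojection. A minimal "fill from the left" hole filter, analogous in spirit to one mode of the SDK's hole_filling_filter but an independent sketch, not its implementation:

```python
import numpy as np

def fill_holes_from_left(depth):
    """Replace zero (invalid) depth pixels with the nearest valid value
    to their left in the same row, sweeping column by column."""
    out = depth.copy()
    for col in range(1, out.shape[1]):
        holes = out[:, col] == 0
        out[holes, col] = out[holes, col - 1]
    return out

# Usage: holes inherit the nearest valid value to their left
d = np.array([[500, 0, 0, 700],
              [0, 300, 0, 0]], dtype=np.uint16)
filled = fill_holes_from_left(d)
```

Pixels with no valid neighbor to their left (e.g. a zero in the first column) stay zero, so the deprojection step can still filter them out.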

saurabheights commented 1 year ago

@SalmaG98 Sorry, but unfortunately I cannot disclose the implementation of the filters, as this work was done for the company.

BTW, I was not able to convert a numpy array to rs.frame.

SalmaG98 commented 1 year ago

It's okay, thank you for your answer :)