Intel has a Python tutorial program for RealSense that aligns depth and RGB data stored in a 'bag' format file, applies "MobileNet-SSD" object detection to the aligned image, and defines a 'bounding box' around the identified object so that the object's distance from the camera can be calculated.
I hope that this tutorial can offer some useful insights for your own project.
https://github.com/IntelRealSense/librealsense/blob/jupyter/notebooks/distance_to_object.ipynb
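In outline, the approach in that notebook looks roughly like the sketch below (this is only an illustration, not the notebook code verbatim; the bag file name and the bounding box coordinates are placeholders standing in for a recording and a detector result):

```python
import numpy as np
import pyrealsense2 as rs

# Play back a recorded bag file instead of a live camera (placeholder file name)
pipeline = rs.pipeline()
config = rs.config()
config.enable_device_from_file("recording.bag")
profile = pipeline.start(config)

# Align the depth frame to the color frame
align = rs.align(rs.stream.color)
frames = align.process(pipeline.wait_for_frames())
depth = np.asanyarray(frames.get_depth_frame().get_data())

# Scale factor to convert raw depth units to meters
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()

# Hypothetical bounding box from the detector, in pixel coordinates
xmin, ymin, xmax, ymax = 200, 150, 400, 350
region = depth[ymin:ymax, xmin:xmax]
distance_m = region[region > 0].astype(float).mean() * depth_scale  # ignore invalid (zero) pixels
print("Estimated distance to object: %.2f m" % distance_m)

pipeline.stop()
```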
The link below suggests Python code for cropping an 'RGB aligned to depth' image to a region of interest (ROI) on a point cloud.
https://github.com/IntelRealSense/librealsense/issues/2769#issuecomment-442035864
Many thanks for your suggestions. I need a ply file in order to perform a pose estimation of the detected objects.
I've used the code from the #2769 comment, but the RGB and the depth images are still not aligned. For example, when I crop an ROI of xmin, xmax, ymin, ymax = 410, 860, 300, 540
(the red box in the RGB picture) from the depth data (the red area in the depth image), this ROI is not aligned with the area I chose in the RGB picture (I've used bricks to illustrate my issue).
Maybe it's an issue with my code:
import pyrealsense2 as rs
import numpy as np
import open3d as o3d
import math as m

def get_image():
    # Define some constants
    resolution_width = 1280  # pixels
    resolution_height = 720  # pixels
    frame_rate = 15  # fps

    # Create a pipeline
    pipeline = rs.pipeline()

    # Enable depth, infrared, color streams
    config = rs.config()
    config.enable_stream(rs.stream.depth)  # , resolution_width, resolution_height, rs.format.z16, frame_rate)
    config.enable_stream(rs.stream.infrared)  # , 1, resolution_width, resolution_height, rs.format.y8, frame_rate)
    config.enable_stream(rs.stream.color)  # , resolution_width, resolution_height, rs.format.bgr8, frame_rate)

    pc = rs.pointcloud()
    pipeline.start(config)

    colorizer = rs.colorizer()
    align_to = rs.stream.color
    align = rs.align(align_to)

    colorizer = rs.colorizer()
    align = rs.align(rs.stream.depth)

    # Skip the first frames
    for i in range(10):
        # Get frameset of color and depth
        frames = pipeline.wait_for_frames()

    aligned = align.process(frames)
    color_aligned_to_depth = aligned.first(rs.stream.color)
    depth_frame = frames.first(rs.stream.depth)
    color_frame = aligned.get_color_frame()

    # Apply filters
    filters = [rs.disparity_transform(),
               rs.spatial_filter(),
               rs.temporal_filter(),
               rs.disparity_transform(False)]
    for f in filters:
        depth_frame = f.process(depth_frame)

    color = np.asanyarray(color_frame.get_data())
    points = pc.calculate(depth_frame)
    w = rs.video_frame(depth_frame).width
    h = rs.video_frame(depth_frame).height
    print('h: ' + str(h) + ' w: ' + str(w))
    vertices = np.asanyarray(points.get_vertices()).view(np.float32).reshape(h, w, 3)

    # Stop pipeline
    pipeline.stop()
    return color, vertices

color_image, verts = get_image()

xmin, xmax, ymin, ymax = 410, 860, 300, 540  # Bounding box

roi = verts[ymin:ymax, xmin:xmax, :]

# Pass xyz to Open3D.o3d.geometry.PointCloud
reshaped_roi = np.reshape(roi, (-1, 3))
print(reshaped_roi.shape)
reshaped_verts = np.reshape(verts, (-1, 3))

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(reshaped_roi)
o3d.io.write_point_cloud("reshaped_roi.ply", pcd)  # Red in the second picture

pcd_verts = o3d.geometry.PointCloud()
pcd_verts.points = o3d.utility.Vector3dVector(reshaped_verts)
o3d.io.write_point_cloud("reshaped_verts.ply", pcd_verts)
Apologies for the delay in responding, as I was carefully considering your code.
It looks as though you are applying align before post-processing. It is recommended that align is applied after post-processing, otherwise the image can be affected by a type of distortion called aliasing, which can produce effects such as jagged lines.
Also in the original Python ROI script that I linked to, the boundary values are set before depth_frame is defined.
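As a rough Python sketch of that recommended order (post-processing first, then align), using the same filter list as your script; this is only an illustration under the assumption that the filters are applied to the whole frameset, which is the chaining that rs-measure does via frameset.apply_filter():

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth)
config.enable_stream(rs.stream.color)
pipeline.start(config)

align = rs.align(rs.stream.color)
filters = [rs.disparity_transform(),
           rs.spatial_filter(),
           rs.temporal_filter(),
           rs.disparity_transform(False)]

frames = pipeline.wait_for_frames()

# Post-process first: each filter takes the frameset, processes the depth
# frame, passes the other frames through and is converted back to a frameset.
for f in filters:
    frames = f.process(frames).as_frameset()

# Only now align the already-filtered depth to the color stream
aligned = align.process(frames)
depth_frame = aligned.get_depth_frame()
color_frame = aligned.get_color_frame()

pipeline.stop()
```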
Many thanks for your response. I've applied your suggested changes. Actually, the main issue was that the RGB image data, for reasons unknown to me, was captured at a resolution of 1920x1080 instead of 1280x720, so it did not match the depth data. Now my program works as it is supposed to.
Thank you very much for your time and support.
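For reference, one way to avoid that silent resolution default is to pass the stream parameters that were left commented out in the script above; a minimal sketch:

```python
import pyrealsense2 as rs

# Explicitly request matching resolutions for depth and color so the color
# stream cannot fall back to 1920x1080
config = rs.config()
config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 15)
config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 15)

pipeline = rs.pipeline()
profile = pipeline.start(config)
```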
You are really welcome. Awesome news that you found a solution - thanks for the update! :)
Hey @MartyG-RealSense, you said:
It looks as though you are applying align before post-processing. It is recommended that align is applied after post-processing, otherwise the image can be affected by a type of distortion called aliasing, which can produce effects such as jagged lines.
In the rs-measure example, align is done before post-processing. But you would suggest it the other way around?
Also in the original Python ROI script that I linked to, the boundary values are set before depth_frame is defined.
By boundary values you mean the bounding box corners? Why did you highlight this?
As far as I understand, they are only used in this line:
roi = verts[ymin:ymax, xmin:xmax, :]
That doesn't really matter as long as they are defined before this point, does it?
Is there any difference in using the alignment like this
auto processed = align.process(frameset);
or like this
data = data.apply_filter(align_to);
?
Does someone know a handy alternative to this nice python/numpy syntax for cropping in C++?
@PerceptMD The recommendation is to do align after post-processing in order to avoid distortion effects such as aliasing (jagged lines). I have had a couple of cases though where the only way that the RealSense user's program would work without errors is if they did alignment before post-processing. I don't think that the program will fail if align comes before post-proc, but doing align after is the recommendation if possible.
I guess you could think of the boundary box horizontal and vertical min-max values in terms of defining the corners of the box. I don't know enough about boundary box programming to comment on it further than that though, as it isn't something that I have done myself.
I believe that align.process() refers to doing alignment with an Align processing block.
In the C# wrapper, align_to was made obsolete in favor of doing alignment with a processing block.
https://github.com/IntelRealSense/librealsense/issues/5582#issuecomment-596092109
If you are able to make use of OpenCV in your C++ project then this link for defining a bounding box may be helpful:
https://stackoverflow.com/questions/14505679/drawing-bounding-box-using-opencv-in-c-enviroment
Thanks for your quick reply.
So if possible the order could be changed in rs-measure, because it led me to apply align before post-processing.
Okay so align.process() would be the way to go.
Thanks for the link, but drawing the box is not my concern. I would be interested in some fancy two-line code for doing this:
vertices = np.asanyarray(points.get_vertices()).view(np.float32).reshape(h, w, 3)
roi = verts[ymin:ymax, xmin:xmax, :]
in C++, if possible.
This is the only C++ method that I personally know of for defining the X-Y of a bounding box with C++:
https://github.com/IntelRealSense/librealsense/issues/2016#issuecomment-403804157
Thanks a lot. ;) I guess reducing the vertices to the ones inside the bounding box won't be a two-liner in C++, but "iterate through your 3D points doing some filtering based on BoundingBox variables, discarding points outside of the box" will do the trick.
Sorry to bother you again, but I still have some issues with the correct alignment of the depth and RGB data.
As hopefully visible in the image below, the masks (blue) that were perfectly recognized in the RGB image by a CNN are not aligned with the real objects in the depth image. There is a slight shift to the right. I could understand the shift if it went the other way, since the origin of the depth image and the RGB sensor are 15 mm apart, but it emerges in the opposite direction.
I've found some hints (e.g. #2445) that this can be overcome by switching on "Heuristic" in the RealSense Viewer and using occlusion filters, but I haven't found any corresponding functions in the pyrealsense2 library.
Do you have any hints to overcome this?
@sch1eicher It's no bother at all :)
When aligning depth to color, the depth image should be aligned to the FOV size of the color sensor. Aligning to the color FOV is necessary because although the D415 camera models use the same imager component for both RGB and IR, on the D435 the IR imager component has a larger "wide" size (compared to the "standard" IR imager size of D415).
Advice about use of the occlusion removal in your own program is provided in the link below:
https://github.com/IntelRealSense/librealsense/issues/5242#issuecomment-563457627
Many thanks for your fast response :) Maybe I'm wrong, but shouldn't the "align" command
align = rs.align(rs.stream.color)
aligned = align.process(frames)
already align the depth image to the FOV size of the color sensor?
A Python alignment example program in the SDK called align_depth2color.py expresses the alignment like this:
align_to = rs.stream.color
align = rs.align(align_to)

try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frames = align.process(frames)
It would look more like your own script if a variable substitution is not used for align:
align = rs.align(rs.stream.color)

try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frames = align.process(frames)
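For completeness, a small continuation of that snippet (my own sketch, not part of the SDK example) showing that the aligned depth and color frames then share the color sensor's resolution:

```python
import numpy as np

# Retrieve the aligned depth frame and the color frame from the aligned frameset
aligned_depth_frame = aligned_frames.get_depth_frame()
color_frame = aligned_frames.get_color_frame()

# Both arrays now have the same height and width
depth_image = np.asanyarray(aligned_depth_frame.get_data())  # (H, W), uint16
color_image = np.asanyarray(color_frame.get_data())          # (H, W, 3), uint8
print(depth_image.shape, color_image.shape[:2])              # should match
```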
Issue Description
Hi, I'm quite new to the GitHub community, so I am sorry if this issue is placed wrongly.
My current program processes the RGB data from the camera in a CNN (Mask R-CNN) and returns polygons of the recognized objects. I would like to crop the recognized objects from the depth data (stored in a ply file) for further processing (pose estimation). The problem is that the depth data is not correctly aligned with the RGB data, and I haven't found a solution in Python for cropping only those points from the ply file.
When I use the 3D view in the RealSense Viewer (2.29.0), the depth and RGB data seem to be aligned perfectly.
If you have a hint, I would be thankful.
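A minimal sketch of this kind of per-object cropping (an illustration only; `verts` is assumed to be the point-cloud vertices reshaped to (H, W, 3) as in the script later in this thread, and `mask` is a hypothetical (H, W) boolean array for one object returned by the Mask R-CNN):

```python
import numpy as np
import open3d as o3d

# Select only the vertices whose pixel falls inside the object's mask
object_points = verts[mask]                                          # (N, 3)
object_points = object_points[np.any(object_points != 0, axis=1)]    # drop invalid zero-depth points

# Write the cropped object as its own ply file for pose estimation
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(object_points)
o3d.io.write_point_cloud("object_roi.ply", pcd)
```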