IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Using RS2 pipeline and V4l2 (rs-multicam and Jetson-inference detectnet camera) #5994


benjamina97 commented 4 years ago

Required Info

| Item | Value |
|---|---|
| Camera Models | D435, T265 |
| Firmware Version (D435) | 5.12.01.00 |
| Firmware Version (T265) | 0.20.879 |
| Operating System & Version | Linux Ubuntu 18.04.4 LTS, JetPack 4.3 |
| Kernel Version | 4.9.140-tegra |
| Platform | NVIDIA Jetson Nano |
| SDK Version | librealsense v2.31.0 and Jetson Inference |
| Language | C++ |
| Segment | Robot, Object Recognition |

Issue Description / Quick Background

Hello! I am doing a robotics project in which I am using the Intel D435 and T265 RealSense cameras with an NVIDIA Jetson Nano.

Ideally, the Nano will run object recognition on the video stream from the D435 and get depth info for the detected objects.

I have modified the rs-multicam.cpp example in order to gather depth and SLAM data simultaneously, as seen in this gif (it's displaying the depth of the center pixel from the D435, plus pose and yaw rotation from the T265):

[gif: libreal]
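In rough outline, it uses one pipeline per device, matched by serial number. A simplified, untested sketch (the serial-number strings are placeholders, not my real ones):

```cpp
// Simplified sketch: one pipeline per device, selected by serial number.
#include <librealsense2/rs.hpp>
#include <cstdio>

int main()
{
    rs2::context ctx;
    rs2::pipeline d435_pipe(ctx), t265_pipe(ctx);
    rs2::config d435_cfg, t265_cfg;

    // Placeholder serials -- rs-enumerate-devices lists the real ones.
    d435_cfg.enable_device("D435_SERIAL");
    d435_cfg.enable_stream(RS2_STREAM_DEPTH, 640, 480, RS2_FORMAT_Z16, 30);
    t265_cfg.enable_device("T265_SERIAL");
    t265_cfg.enable_stream(RS2_STREAM_POSE, RS2_FORMAT_6DOF);

    d435_pipe.start(d435_cfg);
    t265_pipe.start(t265_cfg);

    while (true)
    {
        // Depth at the center pixel from the D435...
        rs2::depth_frame depth = d435_pipe.wait_for_frames().get_depth_frame();
        float center = depth.get_distance(depth.get_width() / 2, depth.get_height() / 2);

        // ...and pose from the T265.
        rs2_pose pose = t265_pipe.wait_for_frames().get_pose_frame().get_pose_data();

        printf("center depth %.2f m | pos (%.2f, %.2f, %.2f)\n", center,
               pose.translation.x, pose.translation.y, pose.translation.z);
    }
}
```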

For object recognition, I want to use the Jetson-inference examples because they run very well on the Nano (~24 FPS). Live detection link: https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-camera-2.md

I have been able to make detectnet-camera work with the D435 by grabbing the RGB V4L2 stream from /dev/video2.

This is accomplished by running:

./detectnet-camera --camera /dev/video2

[gif: detectnet]

Problem (putting them together)

Here is the problem...

I don't know how to combine these programs into one, OR how to have them both running at the same time.

Whenever the RealSense program runs (first gif), the object recognition (second gif) can't run, and vice versa. I suspect this is because the RS2 pipeline takes over all of the camera streams.

The detectnet program expects a V4L2 stream, which I am not sure how to get from the RS2 pipeline (or whether it is even possible). The solution may be to feed the RS2 video frames to detectnet by converting them to the same RGBA format it gets from V4L2.
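Something along these lines is what I have in mind. This is a rough, untested sketch: I'm assuming the detectNet::Detect() signature and the cudaAllocMapped()/cudaRGBA8ToRGBA32() helpers as used by the detectnet-camera example, so those may need adjusting for other jetson-inference/jetson-utils versions:

```cpp
// Rough sketch: request RGBA8 from librealsense, hand frames to detectNet.
#include <librealsense2/rs.hpp>
#include <jetson-inference/detectNet.h>
#include <jetson-utils/cudaMappedMemory.h>
#include <jetson-utils/cudaRGB.h>
#include <cstdio>
#include <cstring>

int main()
{
    const uint32_t width = 640, height = 480;

    // Ask the SDK for RGBA8 directly so no extra CPU-side pixel shuffling is needed.
    rs2::pipeline pipe;
    rs2::config cfg;
    cfg.enable_stream(RS2_STREAM_COLOR, width, height, RS2_FORMAT_RGBA8, 30);
    pipe.start(cfg);

    detectNet* net = detectNet::Create();  // default SSD-Mobilenet-v2 model
    if (!net) return 1;

    // Zero-copy CUDA-mapped buffers, shared between CPU and GPU on the Nano.
    uchar4* rgba8  = nullptr;
    float4* rgba32 = nullptr;
    cudaAllocMapped((void**)&rgba8,  width * height * sizeof(uchar4));
    cudaAllocMapped((void**)&rgba32, width * height * sizeof(float4));

    while (true)
    {
        rs2::video_frame color = pipe.wait_for_frames().get_color_frame();

        // Copy into CUDA-visible memory, then convert to the float4 RGBA
        // layout that this generation of detectNet expects.
        std::memcpy(rgba8, color.get_data(), width * height * sizeof(uchar4));
        cudaRGBA8ToRGBA32(rgba8, rgba32, width, height);

        detectNet::Detection* detections = nullptr;
        const int n = net->Detect((float*)rgba32, width, height, &detections);

        for (int i = 0; i < n; i++)
            printf("'%s' at (%.0f, %.0f)\n",
                   net->GetClassDesc(detections[i].ClassID),
                   detections[i].Left, detections[i].Top);
    }
}
```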

Alternatively (though not as good a solution), there is a detectnet-console program which can do object detection on single images. I could periodically save pictures to the hard drive, run them through detectnet, get the object coordinates, and then get the depth info. I tried implementing this, but I ran into segmentation faults whenever it tried to save the images to the hard drive. I may open another issue with that code if I can't figure out the seg fault.
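For what it's worth, the SDK's own rs-save-to-disk example writes frames with the bundled stb_image_write.h header; a minimal sketch of that approach:

```cpp
// Minimal sketch: save a RealSense color frame to PNG, as in rs-save-to-disk.
#define STB_IMAGE_WRITE_IMPLEMENTATION
#include "stb_image_write.h"        // bundled with the SDK's examples
#include <librealsense2/rs.hpp>

void save_frame_png(const rs2::video_frame& vf, const char* filename)
{
    // Passing the stride explicitly avoids reading past the end of a
    // padded row buffer -- one possible source of seg faults here.
    stbi_write_png(filename, vf.get_width(), vf.get_height(),
                   vf.get_bytes_per_pixel(), vf.get_data(),
                   vf.get_stride_in_bytes());
}
```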

I would appreciate any advice I can get. I have been reading through the RS2 API, but I am not sure what the best solution is. If something doesn't make sense, please let me know and I will clarify and/or send more code samples.

Thank you!

MartyG-RealSense commented 4 years ago

According to the RealSense documentation for compiling Librealsense for Ubuntu, "Linux build configuration is presently configured to use the V4L2 backend by default".

For the RGB sensor though, the native format is described as being YUYV (fourcc 'YUY2').
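If it helps, a minimal sketch along the lines of the SDK's sensor examples will list every format each sensor can serve, so you can confirm the native YUYV profile and the SDK-converted options:

```cpp
// Minimal sketch: enumerate the stream profiles of every attached sensor.
#include <librealsense2/rs.hpp>
#include <iostream>

int main()
{
    rs2::context ctx;
    for (auto&& dev : ctx.query_devices())
        for (auto&& sensor : dev.query_sensors())
            for (auto&& profile : sensor.get_stream_profiles())
                if (profile.is<rs2::video_stream_profile>())
                {
                    auto vp = profile.as<rs2::video_stream_profile>();
                    std::cout << vp.stream_name() << "  "
                              << vp.width() << "x" << vp.height()
                              << " @" << vp.fps() << "fps  "
                              << rs2_format_to_string(vp.format()) << "\n";
                }
}
```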

MartyG-RealSense commented 4 years ago

Hi again @benjamina97

A dedicated T265 engineer on the RealSense support team will be taking up this case from this point onwards. Good luck!

benjamina97 commented 4 years ago

Thanks @MartyG-RealSense! I didn't realize that about the RGB stream. I appreciate the help.

RealSenseSupport commented 3 years ago

Thank you for highlighting the issue with multicam using the T265. We have moved our focus to our next generation of products and, consequently, we will not be addressing this issue on the T265.