IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Setup details for multicamera with D435 #4653

Closed mjsobrep closed 5 years ago

mjsobrep commented 5 years ago
Required Info
Camera Model D435
Firmware Version Current
Operating System & Version Ubuntu 16
Kernel Version (Linux Only) current
Platform Intel NUC 7 i5
SDK Version 2
Language python, C, ROS
Segment Robot

Issue Description

I'm trying to set up a dual-camera configuration with two D435 cameras to track human targets at anywhere from 0.5 to 4 meters; I need the two cameras to achieve a sufficient field of view. This is already discussed in #3836 (a summary of what is available), #3922 (bottlenecks), #2736 (calibration details), and #2637 (lots of implementation details, especially around getting things synchronized and running color and depth pipelines). I still have a few questions, though:

  1. What considerations should be made when physically aligning the cameras? How much overlap should the fields of view have? How are parallax errors minimized? Is it better to have the cameras pointing inwards and crossing over or outwards in a fan shape?

  2. Where is the optical center of the cameras (might help with alignment)?

  3. My understanding is that the hardware sync only affects the depth streams. How then does one get synchronized RGB to the depth?

  4. Is stitched together multicamera going to show up in the API and/or ROS wrapper at some point?

  5. Am I approaching this all wrong? Is there something else I should be doing to track people at close range?

Thanks

MartyG-RealSense commented 5 years ago
  1. You may find it useful to watch a 2018 YouTube presentation that Intel gave on using multiple cameras.

https://www.youtube.com/watch?v=drBCxHhbxI0

It featured suggestions for the orientation of a pair of cameras, as shown in the image extract below, taken from 5 minutes 40 seconds into the video.

[image: suggested orientations for a pair of cameras]

In this section of the presentation, an example was also given of an outward-facing camera arrangement for the D415's FOV angles (there was no equivalent chart for the D435).

[image: FOV angle chart for an outward-facing D415 arrangement]

If you would like to go far deeper into the setup of multiple cameras and their fields of view, I recommend Intel's AR / VR body tracking presentation.

https://www.youtube.com/watch?v=VSHDyUXSNqY&t=571s

  2. I hope the discussion in the link below helps to answer your question about the camera's center point.

https://forums.intel.com/s/question/0D50P0000490TmDSAU/the-origin-of-depth?language=en_US

  3. #2637 had the best discussion of color and depth sync that I know of, though one of the Intel team may be able to add new insights, since some time has passed since @dorodnic's reply in that discussion about the difficulties of multi-cam color and depth sync. (A minimal sync sketch follows at the end of this comment.)

  4. There are a couple of Intel articles that show how to use 2 or 3 cameras with ROS to create a semi-unified point cloud.

https://github.com/IntelRealSense/realsense-ros/wiki/Showcase-of-using-2-cameras

On a larger scale, the CONIX Research Center at Carnegie Mellon used ten to twenty 400 Series cameras and an Ethernet network to stitch together point clouds.

https://github.com/conix-center/pointcloud_stitching
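As a rough sketch of the plumbing behind points 3 and 4, the snippet below enumerates the connected cameras, starts one pyrealsense2 pipeline per device, and sets the inter-camera sync option (1 = master, 2 = slave, per Intel's multi-camera whitepaper). The stream settings are illustrative, and pairing color frames by timestamp is an assumption to work around the fact that hardware sync only drives the depth/IR captures:

```python
import pyrealsense2 as rs

# Enumerate the serial numbers of all connected RealSense devices.
ctx = rs.context()
serials = [dev.get_info(rs.camera_info.serial_number)
           for dev in ctx.query_devices()]

pipelines = []
for i, serial in enumerate(serials):
    cfg = rs.config()
    cfg.enable_device(serial)
    cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
    cfg.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)

    pipe = rs.pipeline(ctx)
    profile = pipe.start(cfg)

    # First camera acts as sync master, the rest as slaves (whitepaper values).
    depth_sensor = profile.get_device().first_depth_sensor()
    depth_sensor.set_option(rs.option.inter_cam_sync_mode, 1 if i == 0 else 2)
    pipelines.append(pipe)

# Hardware sync aligns only the depth/IR captures; the color stream is
# free-running, so pair color frames across cameras by comparing timestamps.
for pipe in pipelines:
    frames = pipe.wait_for_frames()
    depth = frames.get_depth_frame()
    color = frames.get_color_frame()
    print(depth.get_timestamp(), color.get_timestamp())
```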

mjsobrep commented 5 years ago

@MartyG-RealSense All of that was incredibly helpful, thank you.

From all of these resources, I am gathering that I really should be using a pair of D415 cameras instead of D435 cameras. It seems like the D415 has better spatial resolution, has better hardware sync, keeps the cameras in calibration better, and might benefit from the color-over-IR stereo.

I'm still a little unclear about the ideal spacing of multiple cameras. The basic configurations are well covered in the videos, but thinking through the spacing between cameras, angular offsets, etc. still leaves me scratching my head. If anyone has intuition on that, or has played around with different setups and could share insights, that would be helpful.

MartyG-RealSense commented 5 years ago

I recall a recent case of a RealSense user who wanted to do human tracking similar to yours. Their approach to the FOV problem was to mount a single camera on a wall above head height, pointing downward a little. Another user recently took this elevated-view approach too. An advantage is that it prevents people from evading the camera by crawling under its view, since the floor is included in the view. For a more complex human tracking application (reading body data rather than just reacting to presence like a motion detector), multiple cameras pointing straight forward, as in the diagrams above, will be preferable.

https://github.com/IntelRealSense/librealsense/issues/2723#issuecomment-497637299

In regard to using a D415: though it has a slower shutter than the D435 (making the D435 better at tracking fast motion), the D415 should be able to cope with humans moving at walking pace without problems.

If you are mounting the cameras side by side for a wide view instead of stacked for a tall one, the more cameras the better: blind spots are reduced, and the data improves because FOV overlap gives you redundant coverage of the same area from more than one camera. Four cameras would be optimal for a 180 degree view, though for a more restricted view, 2 cameras mounted horizontally is manageable. (A rough coverage calculation follows at the end of this comment.)

Below is an example of a 180 degree setup with 4 D435.

https://www.intelrealsense.com/intel-realsense-volumetric-capture

If you look at the main header image of four spaced-out cameras and consider just the two in the middle, ignoring the two outermost ones, that may give a good visual indication of the horizontal spacing for a 2-camera setup.
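To make the coverage arithmetic concrete, here is a back-of-the-envelope sketch of a fanned horizontal layout. The horizontal FOV figures are approximate datasheet values (about 87 degrees for the D435 depth stream, about 65 degrees for the D415), and the 15 degree overlap between neighbours is an arbitrary assumption, not an Intel guideline:

```python
import math

def fan_layout(target_fov_deg, camera_hfov_deg, overlap_deg=15.0):
    """Number of cameras and yaw offsets needed to cover target_fov_deg."""
    step = camera_hfov_deg - overlap_deg            # yaw between neighbours
    n = max(1, math.ceil((target_fov_deg - camera_hfov_deg) / step) + 1)
    yaws = [(i - (n - 1) / 2) * step for i in range(n)]  # centered fan
    return n, yaws

for model, hfov in (("D435", 87.0), ("D415", 65.0)):
    n, yaws = fan_layout(180.0, hfov)
    print(model, n, [round(y, 1) for y in yaws])
# At this overlap, three D435 cover 180 degrees (a fourth adds the
# redundancy mentioned above), while the narrower D415 needs four.
```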

mjsobrep commented 5 years ago

That's an interesting idea, using the scenery to keep subjects from exiting the field of view. In my application, a mobile robot within a meter or so of pediatric subjects needs to track the subjects' arms and faces. Given your suggestions, I will try two layouts: one with the cameras spaced vertically, both facing the same direction, and one with them close together and angled outward.

So the D415 depth sensor has a rolling shutter. But the color cameras have rolling shutters on both models, right? In the example you shared at https://youtu.be/VSHDyUXSNqY they were able to run OpenPose on the color stream successfully, so I think the rolling shutter should be OK. Even for ballistic arm motions, I am hoping I can get good wrist position data (wrists are probably the most important target for my application).

Thanks for all of your help @MartyG-RealSense you have been able to point me to a lot of resources which I could not otherwise find.

MartyG-RealSense commented 5 years ago

You're very welcome. I'm glad I could be of help. :)

The color imagers are rolling shutter, yes. The faster global shutters on the D435 are useful for measuring fast-changing depth, such as the real-time distance of an oncoming car, or of outdoor scenery approaching a car-mounted camera. Autonomous vehicles were one of the applications the 400 Series cameras were designed with in mind, along with drones and robotics.

You may be interested in an organization that has used RealSense and skeletal tracking to track the motions of school children. Its product uses the D415, which suggests that the D415's rolling shutters should be able to cope with fast arm motions by children.

https://www.intelrealsense.com/active-learning-with-intel-realsense-technology/

mjsobrep commented 5 years ago

That application by Prowise is very cool. It definitely helps build confidence that I can do what I need to with the D415. I'm going to go pick a couple up.

jonra1993 commented 4 years ago

Hi @MartyG-RealSense, I am working on multicamera calibration. I have 3 RealSense D415 cameras in different positions, all pointing at the same area, and I am trying to work in real time. I did the alignment manually using realsense-ros, and I am currently using post-processing filters and PCL filters. I have noticed that the resulting point clouds have a lot of noise and problems with flat surfaces, which makes calibration and PCL filtering difficult. Do you have any suggestions for improving the accuracy of the point clouds? Also, do you know of a method for automating the extrinsic calibration process?

[screenshot: the stitched point cloud showing the noise and flat-surface artifacts described above]
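As a side note on the noise question: librealsense ships a documented depth post-processing chain (decimation, then spatial and temporal smoothing in disparity space) that is often effective against exactly this kind of flat-surface noise before the point cloud stage. A minimal pyrealsense2 sketch, with illustrative rather than recommended parameter values:

```python
import pyrealsense2 as rs

pipe = rs.pipeline()
pipe.start()  # default depth stream

# Documented filter order: decimate, convert to disparity, smooth
# spatially and temporally, then convert back to depth.
decimate = rs.decimation_filter()
decimate.set_option(rs.option.filter_magnitude, 2)
to_disparity = rs.disparity_transform(True)
spatial = rs.spatial_filter()
spatial.set_option(rs.option.filter_smooth_alpha, 0.5)  # illustrative value
spatial.set_option(rs.option.filter_smooth_delta, 20)   # illustrative value
temporal = rs.temporal_filter()
to_depth = rs.disparity_transform(False)

pc = rs.pointcloud()
frames = pipe.wait_for_frames()
depth = frames.get_depth_frame()
for f in (decimate, to_disparity, spatial, temporal, to_depth):
    depth = f.process(depth)
points = pc.calculate(depth)  # filtered vertices, ready for PCL / stitching
```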

MartyG-RealSense commented 4 years ago

Pages 17 and 18 of the Programmer's Guide for creating your own custom version of the Dynamic Calibrator tool discuss using target-less calibration in conjunction with a gimbal to update calibration in real time. The gimbal moves the camera independently based on scene detection results.

https://www.intel.com/content/www/us/en/support/articles/000026724/emerging-technologies/intel-realsense-technology.html

jonra1993 commented 4 years ago

@MartyG-RealSense Thanks for your response. I read the PDF, but I am looking for something that lets me do extrinsic calibration between multiple cameras. I read in some documentation that Intel recommends Vicalib. I would like to know if there is any example of how to use Vicalib with Intel RealSense; I have both the D415 and D435.

MartyG-RealSense commented 4 years ago

An Intel RealSense team member said in the past week that Vicalib no longer works with the 400 Series cameras, though it still does with the T265 tracking camera.

The CONIX Research Center at Carnegie Mellon developed a multicam point cloud stitching system with calibration, in which the captured data was sent to a central computer for processing and then viewed with PCL. Their calibration method incorporated an expensive precision surveying device called a theodolite, which my research indicated costs on average between $600 and $1500 on Amazon.

https://github.com/conix-center/pointcloud_stitching
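Once per-camera extrinsics are known (from CONIX's method or any other calibration), the stitching step itself reduces to applying each camera's rigid transform and concatenating the clouds. A minimal NumPy sketch, where the 4x4 matrices are placeholders that the calibration procedure would supply:

```python
import numpy as np

def to_world(points_xyz, T_world_from_cam):
    """Map an (N, 3) camera-frame point cloud into the shared world frame."""
    homogeneous = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    return (T_world_from_cam @ homogeneous.T).T[:, :3]

# Placeholder extrinsics: camera 0 defines the world frame; camera 1 is
# mounted 0.5 m to its right (values are illustrative only).
T0 = np.eye(4)
T1 = np.eye(4)
T1[:3, 3] = [0.5, 0.0, 0.0]

cloud0 = np.random.rand(100, 3)  # stand-ins for real per-camera clouds
cloud1 = np.random.rand(100, 3)
merged = np.vstack([to_world(cloud0, T0), to_world(cloud1, T1)])
print(merged.shape)  # (200, 3)
```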

An Intel multicam demo in January 2018 took the approach of capturing with 4 separate PCs whose D435 cameras (one per PC) were hardware-synced, and sending the point cloud data to a 5th PC for combining and post-processing with the help of Unity.

https://www.intelrealsense.com/intel-realsense-volumetric-capture/