umrover / mrover-ros

MRover ROS Source Code
https://mrover.org
GNU General Public License v3.0

Point cloud parameters #12

Closed: niwhsa9 closed this issue 1 year ago

niwhsa9 commented 2 years ago

@arschallwig can you summarize what you found out so far? I remember that you were having trouble finding a way to actually simulate the point cloud parameters for a stereo cam due to how the library represents the parameters.

arschallwig commented 2 years ago

Yeah I'll write something up this evening

arschallwig commented 2 years ago

@niwhsa9 When trying to accurately recreate the ZED 2i in ROS, I ran into a few points that prevent an obvious, clear route forward:

niwhsa9 commented 2 years ago

Thanks for that great summary and set of options, very nice work.

I've written out some of my thoughts below. I apologize in advance because this got very long-winded and some of it is not directly related to the issue at hand, but I promise I have some practical points in here.

Let's start with the tangent:

I think fundamentally there are two classes of things we are interested in simulating accurately when it comes to depth cameras:

  1. Certain physical/intrinsic parameters, namely vertical FOV (or focal length), horizontal FOV (or focal length), and minimum/maximum range (see the sketch after this list). I actually don't think we care about things like distortion coefficients, optical centers, and skew, because the ZED undistorts before producing output data anyway.
  2. Stereo matching algorithm artifacts. This could be any number of things. For example, I've heard of stereo algorithms that enforce consistency constraints on depth gradients, which means that at the edges of discrete objects an interpolation essentially occurs from the object to the background. I haven't actually seen this in the ZED, so I don't think it does this particular thing. One thing I have seen is random missing patches in point clouds, likely because the ZED won't produce readings when the match falls below some certainty metric.
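To make class 1 concrete, here is a minimal sketch of the pinhole relationships those parameters boil down to. The focal lengths, resolution, and range limits below are illustrative only, not an actual ZED 2i calibration:

```python
import math

def fov_deg(focal_px: float, size_px: int) -> float:
    """Pinhole model: FOV = 2 * atan(size / (2 * f))."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))

# Illustrative values only -- not a real ZED 2i calibration.
fx, fy = 530.0, 530.0   # focal lengths in pixels
w, h = 1280, 720        # image resolution

print(f"HFOV ~ {fov_deg(fx, w):.1f} deg, VFOV ~ {fov_deg(fy, h):.1f} deg")

# Min/max range just clamps which depth readings get reported.
z_min, z_max = 0.3, 20.0  # hypothetical range limits in meters
z = 1.5
valid = z_min <= z <= z_max
```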

When I assigned you this issue I had not really considered that second class of features at all. I also did not consider it when we purchased the ZED over the Realsense. Unfortunately, dealing with this is where we get screwed for purchasing a proprietary stereo solution: we don't really know anything about the matching algorithm or what kind of artifacts it produces other than what we can qualitatively observe, and even then we don't know the root cause, so it's hard to reproduce ourselves. Given that this is the case, I vote that we entirely ignore class 2 features for now; that is a long-term goal for the simulation team. Ignoring these will mean our simulation is quite idealistic, so people should be aware of that when testing against sim. On the other hand, I think we should be able to nearly perfectly match class 1 conditions in simulation, and that's what I'm really after with this feature request.

Okay, back to the actual problem. Let me address your proposed solutions.

  1. Doing our own stereo matching seems like a poor option to me, not only because of the workload overhead but also because we may end up introducing new class 2 artifacts that don't actually exist in the ZED's proprietary algorithm.
  2. The built-in plugin seems like it can address all of the parameters I was targeting in class 1, with one unfortunate exception: there is only one focal length, so you are forced into a 1:1 pixel aspect ratio (fx = fy) for the clouds (see the sketch after this list). This matters because our visibility of AR tags at close range is limited by the vertical FOV, which the vertical focal length determines.
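For reference, here is a sketch of the single-focal-length constraint in option 2. My understanding (treat it as an assumption) is that the plugin takes a horizontal FOV, which implies one focal length, and the vertical FOV is then pinned by the image aspect ratio, so there is no second knob to make fy differ from fx:

```python
import math

def derived_vfov_deg(h_fov_deg: float, width: int, height: int) -> float:
    # One focal length f is implied by the chosen horizontal FOV...
    f = width / (2.0 * math.tan(math.radians(h_fov_deg) / 2.0))
    # ...and the vertical FOV is then fixed by the aspect ratio.
    return math.degrees(2.0 * math.atan(height / (2.0 * f)))

print(derived_vfov_deg(100.0, 1280, 720))  # ~68 deg, whether we like it or not
```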

Of these two choices, I like option 2 much better; we just need to work around that caveat of the single focal length parameter. For starters, don't worry too much about the fact that you have a set of focal lengths for each eye. The ZED has a relatively small stereo baseline, so I think it's totally fine to just pick one eye's set. Now the question is what to do about the non-square pixel aspect ratio (fx ≠ fy).
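To sanity-check how far apart the two eyes' focal sets actually are (and how far fx sits from fy within one eye), something like this could dump the intrinsics the ZED wrapper publishes. The topic name here is a guess; check `rostopic list` on the real system:

```python
import rospy
from sensor_msgs.msg import CameraInfo

def on_info(msg: CameraInfo) -> None:
    # K is the row-major 3x3 intrinsic matrix: [fx 0 cx; 0 fy cy; 0 0 1]
    fx, fy = msg.K[0], msg.K[4]
    rospy.loginfo("fx=%.2f px, fy=%.2f px at %dx%d", fx, fy, msg.width, msg.height)

rospy.init_node("zed_intrinsics_probe")
# Assumed topic name -- the actual ZED wrapper remappings may differ.
rospy.Subscriber("/zed/left/camera_info", CameraInfo, on_info)
rospy.spin()
```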

There is a simple, conservative solution that might allow us to proceed quickly: just pick fy. This will make the vertical FOV faithful to reality and the horizontal FOV smaller than reality. Hopefully this isn't completely detrimental to our performance. If we are still seeing good navigation behavior with this limit, then we can punt this issue down the road for our simulation people to deal with more adequately in a custom plugin implementation.
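A quick sketch of what "just pick fy" costs, using hypothetical numbers chosen so that fy > fx (consistent with the horizontal FOV shrinking rather than growing):

```python
import math

def fov_deg(f_px: float, size_px: int) -> float:
    return math.degrees(2.0 * math.atan(size_px / (2.0 * f_px)))

# Hypothetical calibration, illustrative only.
fx, fy = 525.0, 545.0
w, h = 1280, 720

print("real HFOV:", fov_deg(fx, w))  # what the camera actually sees (~101 deg)
print("sim HFOV: ", fov_deg(fy, w))  # narrower, because fy > fx (~99 deg)
print("VFOV:     ", fov_deg(fy, h))  # identical in sim and reality (~67 deg)
```

The conservative direction matters: a too-narrow simulated HFOV can only hide objects the real camera would see, never show objects it wouldn't.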

tl;dr: go with option 2