umrover / mrover-ros

MRover ROS Source Code
https://mrover.org
GNU General Public License v3.0

Point cloud parameters #12

Closed: niwhsa9 closed this issue 1 year ago

niwhsa9 commented 2 years ago

@arschallwig can you summarize what you found out so far? I remember that you were having trouble finding a way to actually simulate the point cloud parameters for a stereo cam due to how the library represents the parameters.

arschallwig commented 2 years ago

Yeah I'll write something up this evening

arschallwig commented 2 years ago

@niwhsa9 When trying to accurately recreate the ZED 2i in ROS, I ran into a few points that prevent an obvious, clear route forward:

niwhsa9 commented 2 years ago

Thanks for that great summary and set of options, very nice work.

I've written out some of my thoughts below. I apologize in advance because this got very long-winded and some of it is not directly related to the issue at hand, but I promise I have some practical points in here.

Let's start with the tangent:

I think fundamentally there are two classes of things we are interested in simulating accurately when it comes to depth cameras:

  1. Certain physical/intrinsic parameters, namely vertical FOV (or focal length), horizontal FOV (or focal length), and minimum/maximum range (see the sketch after this list). I actually don't think we care about things like distortion coefficients, optical centers, and skew, because the ZED undistorts before producing output data anyway.
  2. Stereo matching algorithm artifacts. This could be any number of things. For example, I've heard of stereo algorithms that enforce consistency constraints on depth gradients, which means that at the edges of discrete objects an interpolation essentially occurs from the object to the background. I haven't actually seen this in the ZED, so I don't think it does this particular thing. One thing I have seen is random missing patches in point clouds, likely because the ZED won't produce readings when the match falls below some certainty metric.
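To make class 1 concrete, here is a minimal sketch of the pinhole relationships those parameters boil down to. The focal lengths, resolution, and range limits below are illustrative only, not an actual ZED 2i calibration:

```python
import math

def fov_deg(focal_px: float, size_px: int) -> float:
    """Pinhole model: FOV = 2 * atan(size / (2 * f))."""
    return math.degrees(2.0 * math.atan(size_px / (2.0 * focal_px)))

# Illustrative values only -- not a real ZED 2i calibration.
fx, fy = 530.0, 530.0   # focal lengths in pixels
w, h = 1280, 720        # image resolution

print(f"HFOV ~ {fov_deg(fx, w):.1f} deg, VFOV ~ {fov_deg(fy, h):.1f} deg")

# Min/max range just clamps which depth readings get reported.
z_min, z_max = 0.3, 20.0  # hypothetical range limits in meters
z = 1.5
valid = z_min <= z <= z_max
```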

When I assigned you this issue I had not really considered that second class of features at all. I also did not consider it when we purchased the ZED over the Realsense. Unfortunately, dealing with this is where we get screwed for purchasing a proprietary stereo solution: we don't really know anything about the matching algorithm or what kind of artifacts it produces other than what we can qualitatively observe, and even then we don't know the root cause, so it's hard to reproduce ourselves. Given that this is the case, I vote that we entirely ignore class 2 features for now; that is a long-term goal for the simulation team. Ignoring these will mean our simulation is quite idealistic, so people should be aware of that when testing against sim. On the other hand, I think we should be able to nearly perfectly match class 1 conditions in simulation, and that's what I'm really after with this feature request.

Okay, back to the actual problem. Let me address your proposed solutions.

  1. Doing our own stereo matching seems like a poor option to me, not only because of the workload overhead but also because we may end up introducing new class 2 artifacts that don't actually exist in the ZED's proprietary algorithm.
  2. The built-in plugin seems like it can address all of the parameters I was targeting in class 1, with one unfortunate exception: there is only one focal length, so you are forced into a 1:1 pixel aspect ratio (fx = fy) for the clouds (see the sketch after this list). This matters because our visibility of AR tags at close range is limited by the vertical FOV, which the vertical focal length determines.
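For reference, here is a sketch of the single-focal-length constraint in option 2. My understanding (treat it as an assumption) is that the plugin takes a horizontal FOV, which implies one focal length, and the vertical FOV is then pinned by the image aspect ratio, so there is no second knob to make fy differ from fx:

```python
import math

def derived_vfov_deg(h_fov_deg: float, width: int, height: int) -> float:
    # One focal length f is implied by the chosen horizontal FOV...
    f = width / (2.0 * math.tan(math.radians(h_fov_deg) / 2.0))
    # ...and the vertical FOV is then fixed by the aspect ratio.
    return math.degrees(2.0 * math.atan(height / (2.0 * f)))

print(derived_vfov_deg(100.0, 1280, 720))  # ~68 deg, whether we like it or not
```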

Of these two choices, I like option 2 much better; we just need to work around that caveat of the single focal length parameter. For starters, don't worry too much about the fact that you have a set of focal lengths for each eye. The ZED has a relatively small stereo baseline, so I think it's totally fine to just pick one eye's set. Now the question is what to do about the non-square pixel aspect ratio (fx ≠ fy).
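To sanity-check how far apart the two eyes' focal sets actually are (and how far fx sits from fy within one eye), something like this could dump the intrinsics the ZED wrapper publishes. The topic name here is a guess; check `rostopic list` on the real system:

```python
import rospy
from sensor_msgs.msg import CameraInfo

def on_info(msg: CameraInfo) -> None:
    # K is the row-major 3x3 intrinsic matrix: [fx 0 cx; 0 fy cy; 0 0 1]
    fx, fy = msg.K[0], msg.K[4]
    rospy.loginfo("fx=%.2f px, fy=%.2f px at %dx%d", fx, fy, msg.width, msg.height)

rospy.init_node("zed_intrinsics_probe")
# Assumed topic name -- the actual ZED wrapper remappings may differ.
rospy.Subscriber("/zed/left/camera_info", CameraInfo, on_info)
rospy.spin()
```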

There is a simple, conservative solution that might allow us to proceed quickly: just pick fy. This will make the vertical FOV faithful to reality and the horizontal FOV smaller than reality. Hopefully this isn't completely detrimental to our performance. If we are still seeing good navigation behavior with this limit, then we can punt this issue down the road for our simulation people to deal with more adequately in a custom plugin implementation.
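A quick sketch of what "just pick fy" costs, using hypothetical numbers chosen so that fy > fx (consistent with the horizontal FOV shrinking rather than growing):

```python
import math

def fov_deg(f_px: float, size_px: int) -> float:
    return math.degrees(2.0 * math.atan(size_px / (2.0 * f_px)))

# Hypothetical calibration, illustrative only.
fx, fy = 525.0, 545.0
w, h = 1280, 720

print("real HFOV:", fov_deg(fx, w))  # what the camera actually sees (~101 deg)
print("sim HFOV: ", fov_deg(fy, w))  # narrower, because fy > fx (~99 deg)
print("VFOV:     ", fov_deg(fy, h))  # identical in sim and reality (~67 deg)
```

The conservative direction matters: a too-narrow simulated HFOV can only hide objects the real camera would see, never show objects it wouldn't.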

tl;dr: go with option 2