EasternEdgeRobotics / Software_2017

The control software for 2017
MIT License

Calculate distances on the ground plane #297

Open cal-pratt opened 7 years ago

cal-pratt commented 7 years ago

This is a proposed idea for calculating the relative distance between multiple points on the ground plane. It combines data from the IMU and depth sensor to let us determine distances on the ground plane with respect to the ROV. This issue is being created to discuss possible implementations and hash out a full solution for measuring these distances.

**Determining Depth**

By recording the pressure at the bottom of the pool we will be able to determine the distance from the surface to the ground plane. After this value is recorded, the distance from the ROV to the ground plane can be determined at any time by subtracting the ROV's distance from the surface, computed from the latest pressure reading.

image
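In code the height calculation is just a couple of subtractions (a sketch; the variable names are made up, and a hydrostatic conversion from gauge pressure to depth in fresh water is assumed):

// depth (m) = gaugePressure (Pa) / (rho * g), with rho = 1000 kg/m^3, g = 9.81 m/s^2
double groundDepth = groundPressurePa / (1000.0 * 9.81); // recorded while on the bottom
double rovDepth = currentPressurePa / (1000.0 * 9.81);   // latest pressure reading
double heightAboveGround = groundDepth - rovDepth;       // ROV to ground plane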

**Determining Position**

Using the accelerometer/gyroscope on the ROV and/or by placing the camera at a known angle, we will be able to determine the angle between the image plane's normal vector and the depth vector. Knowing the intrinsic properties of our camera will let us determine the angle between the image plane's normal vector and a target pixel. Consider the following case where the target pixel is below the center pixel (which is in line with the normal vector).

image

We now have three knowns in the solution: the height of the camera, the angle between the target and the depth vector, and the right angle between the ground plane and the depth vector. Using simple trig we can then calculate the distance of the target. This solution can then be easily expanded to calculate distances given both horizontal and vertical displacement of the target pixel.
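For the vertical-displacement case the trig reduces to a single line (a sketch; cameraHeight and targetAngle are assumed to come from the depth and angle steps above):

// cameraHeight: height above the ground plane in meters (from the depth sensor)
// targetAngle: angle between the depth vector and the target ray, in radians
double groundDistance = cameraHeight * Math.tan(targetAngle);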


FifoIronton commented 7 years ago

Are we currently doing anything to correct for the fish-eye lenses on the cameras? I feel like I've heard people talk about that, but am not sure.

cal-pratt commented 7 years ago

> Are we currently doing anything to correct for the fish-eye lenses on the cameras? I feel like I've heard people talk about that, but am not sure.

Not yet, but we will need to. JavaCV is a wrapper around the OpenCV methods, which will give us access to methods like findChessboardCorners and calibrateCamera. We'll also need a way to save the checkerboard images, the camera distortion matrix, and the intrinsic camera values for each of the cameras.
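For reference, a rough sketch of what the calibration calls might look like with the JavaCV presets (untested; boardSize, grayFrame, objectPoints, imagePoints, and imageSize are placeholders):

Size boardSize = new Size(9, 6); // inner corners per chessboard row/column
Mat corners = new Mat();
boolean found = opencv_calib3d.findChessboardCorners(grayFrame, boardSize, corners);

// After collecting object/image point pairs from several checkerboard frames:
Mat cameraMatrix = new Mat();
Mat distortionCoeffs = new Mat();
MatVector rvecs = new MatVector();
MatVector tvecs = new MatVector();
double rmsError = opencv_calib3d.calibrateCamera(
    objectPoints, imagePoints, imageSize,
    cameraMatrix, distortionCoeffs, rvecs, tvecs);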

cal-pratt commented 7 years ago

Okay, so I've come up with a newer approach to this problem which does not require the IMU. After we calibrate the camera we will have the camera matrix and the distortion coefficients of that camera. Once we have these we can remove the effects of radial distortion on the test image.

With the intrinsic camera values it is possible to determine the translational and rotational offsets of points on the ground plane by examining image-coordinate vs object-coordinate point pairs. Using Perspective-n-Point (PnP) algorithms we can determine the transformation that an object-coordinate must undergo to be projected onto the image-plane. By solving this set of equations, it is possible to determine the camera offset and rotation in terms of object coordinates. The object coordinates are arbitrary but must be in proper proportion to the real world. Assume that we choose one of the cargo containers and estimate a 90 degree angle with two equal sides.

image

This plot provides us with a set of pixel and object coordinate pairs. The translational and rotational vectors can be obtained using the opencv_calib3d#solvePnP method.

opencv_calib3d.solvePnP(
    objectCoordinates, // found using gui
    targetCoordinates, // found using gui
    cameraMatrix, // found during calibration 
    distortionCoeffs, // found during calibration 
    rotationalVector, // returned by method 
    translationalVector); // returned by method 
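One detail worth noting: solvePnP returns the rotation as a 3x1 Rodrigues vector, not a matrix, so it needs to be converted before it can be inverted and multiplied below (this uses the JavaCV wrapper of cv::Rodrigues):

Mat rotationalMatrix = new Mat();
opencv_calib3d.Rodrigues(rotationalVector, rotationalMatrix);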

These vectors provide a way to get image coordinates from object coordinates; in this case, however, we require object coordinates from image coordinates (we will later find a scalar to transform object units into world units). The tricky part here is that a 3D coordinate projects onto a 2D image plane in only one unique spot, whereas an image coordinate maps into 3D space as a ray (infinitely many locations). Because we know the interest point is restricted to where the ray intersects the object plane, we have enough information to solve this problem.

image

image

s = scalar, Pc = imagePoint, Pw = objectPoint, K = intrinsicMatrix, R = rotationMatrix, T = translationVector

sPc = K(RPw + T)
sK^-1 Pc = RPw + T
sR^-1 K^-1 Pc = Pw + R^-1 T
sR^-1 K^-1 Pc - R^-1 T = Pw

// The ground plane coordinate must have a zero z component:
[sR^-1 K^-1 Pc - R^-1 T]{z} = Pw{z} = 0
s[R^-1 K^-1 Pc]{z} = [R^-1 T]{z}
s = [R^-1 T]{z} / [R^-1 K^-1 Pc]{z}
// Plug s back into sR^-1 K^-1 Pc - R^-1 T and solve for Pw{x, y}.

Once this system is solved we can take any image point and project it onto the object plane. Knowing the height of the camera from the depth sensor will allow us to translate the z component of the translationVector into meters, providing us with real measurements. This process will be repeated with many ~90 degree angles, averaging the values that are close and removing the outliers.

Java code example:

import org.bytedeco.javacpp.indexer.DoubleIndexer;
import org.bytedeco.javacpp.opencv_core;
import org.bytedeco.javacpp.opencv_core.Mat;

// Build the homogeneous image point Pc = (x, y, 1).
// CV_64F so that createIndexer() returns a DoubleIndexer.
final Mat P = new Mat(3, 1, opencv_core.CV_64F);
final DoubleIndexer PIndexer = P.createIndexer();
PIndexer.put(0, 0, x);
PIndexer.put(1, 0, y);
PIndexer.put(2, 0, 1); // make image point P homogeneous

Mat T = translationalVector;
Mat R = rotationalMatrix; // 3x3, from Rodrigues(rotationalVector, ...)
Mat C = cameraMatrix;

Mat Rinv = R.inv().asMat();
Mat Cinv = C.inv().asMat();

// R^-1 K^-1 Pc
Mat Cinv_P = new Mat();
opencv_core.gemm(Cinv, P, 1, new Mat(), 0, Cinv_P);

Mat Rinv_Cinv_P = new Mat();
opencv_core.gemm(Rinv, Cinv_P, 1, new Mat(), 0, Rinv_Cinv_P);

// R^-1 T
Mat Rinv_T = new Mat();
opencv_core.gemm(Rinv, T, 1, new Mat(), 0, Rinv_T);

// s = [R^-1 T]{z} / [R^-1 K^-1 Pc]{z}
DoubleIndexer i1 = Rinv_Cinv_P.createIndexer();
DoubleIndexer i2 = Rinv_T.createIndexer();
double s = i2.get(2, 0) / i1.get(2, 0);

// Pw = sR^-1 K^-1 Pc - R^-1 T
Mat Rinv_Cinv_s_P = opencv_core.multiply(s, Rinv_Cinv_P).asMat();
Mat O = opencv_core.subtract(Rinv_Cinv_s_P, Rinv_T).asMat();
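As a sketch of the final scaling step (hypothetical names; heightFromDepthM is the camera height reported by the pressure sensor), following the idea above of equating the depth reading with the z component of the translation vector:

DoubleIndexer tIndexer = T.createIndexer();
double metersPerUnit = heightFromDepthM / Math.abs(tIndexer.get(2, 0));

// Any projected ground-plane point O can now be read off in meters:
DoubleIndexer OIndexer = O.createIndexer();
double xMeters = OIndexer.get(0, 0) * metersPerUnit;
double yMeters = OIndexer.get(1, 0) * metersPerUnit;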

FifoIronton commented 7 years ago

Just for my own understanding, can you answer the following questions about the distance calculator tool? These aren't meant to be criticisms; I just think it's cool and want to understand a little more.

I know we've talked about it, but I have no experience in computer vision and minimal experience with ROV competitions.

  1. Does this rely on taking a shot that contains all of the crates? It feels like these calculations are very dependent on depth, camera angle, and other stuff, so you'd need to capture everything in one shot and do it all at once

  2. In your example, you make a guess that an angle is 90deg and that two lengths are equal. In real life are we making these guesses, or are we pulling all of our angles/lengths from the boxes because they are known?

  3. How precise do these lines have to be, do you think? Will it throw things off if the Science Officer draws a line 2deg off or a little too short in the heat of the moment?

cal-pratt commented 7 years ago

Fair questions; questions are good 👍

> Does this rely on taking a shot that contains all of the crates?

You only need to capture a few points of interest in each image you take. In each following image you'll need some overlapping points from the previous image in order to locate the newer points with respect to the older ones.

> In your example, you make a guess that an angle is 90deg and that two lengths are equal. In real life are we making these guesses, or are we pulling all of our angles/lengths from the boxes because they are known?

Right now we're making guesses. I feel like there should be a way to determine the translation/rotation matrices by just knowing 90 degree angles, but the math hasn't hit me yet. The idea right now is to draw multiple 90 degree triangles on the image and examine the rotational/translational matrices produced by each triangle.

If we assign one triangle to be the origin we can use that to find the world coordinates of the other triangles. If this image-to-object projection causes the other triangle not to be 90 degrees, we can go back and tweak our inputs. In the following image we have 3 possible object origins:

image

Say red was chosen as the origin. We'd then have to project the purple and green image points onto the object plane we derived from selecting red (numbers made up).

image

We can then verify after the projection that the green and the purple triangles are still 90 degree angles with two equal sides using some trig. Once we're happy with the values on screen we can apply a weighting to the coordinate space by equating the pressure sensor height reading to the camera origin z in the object space (the z component of the translational vector).
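A quick way to do that check in code (a sketch; the projected corner is (x0, y0), the other two points are (x1, y1) and (x2, y2), and the tolerances are made up):

// Angle at the corner via the dot product, plus a side-length comparison.
double ax = x1 - x0, ay = y1 - y0;
double bx = x2 - x0, by = y2 - y0;
double lenA = Math.hypot(ax, ay);
double lenB = Math.hypot(bx, by);
double angle = Math.toDegrees(Math.acos((ax * bx + ay * by) / (lenA * lenB)));
boolean looksGood = Math.abs(angle - 90) < 5 && Math.abs(lenA - lenB) / lenA < 0.05;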

> How precise do these lines have to be, do you think? Will it throw things off if the Science Officer draws a line 2deg off or a little too short in the heat of the moment?

This still needs to be tested fully. I'm assuming the error would be proportional to the side and angle error of the guessed triangle. The larger the triangle, the easier it will be to place good points, and the lower the error will be on the guessed triangle.

ConnorWhalen commented 7 years ago

This looks like it has a lot of potential. Our past relative measuring tools weren't able to account for all of the 3D-perspective funkiness, but the combination of sticking to the ground plane and those nice 90 degree references makes it a lot more feasible. One concern I have is that our current camera setup might have issues getting a wide, clean shot of everything you want to measure. It's no dealbreaker though.

cal-pratt commented 7 years ago

Yeah, but hopefully with the HD screen caps we'll get clear images. And you don't need everything in the same shot to get all the values; at a minimum you only need two overlapping points. You just have to use some simple "3 angles add to 180" logic to connect the two:

image

cal-pratt commented 7 years ago

Note to self: https://stackoverflow.com/questions/14502777/opencv-solvepnp-barreldistoriton