ZiadMansourM / photogrammetry

Photogrammetry: Close Range 3D scanning. Our graduation project 🎓
https://docs.scanmate.sreboy.com/
GNU General Public License v3.0

Debug - Triangulation #31

Open MohamedWZS opened 1 year ago

MohamedWZS commented 1 year ago

Issue for debugging and fixing Triangulation step

MohamedWZS commented 1 year ago
import numpy as np
import cv2

def triangulate_points(P1, P2, points1, points2):
    """
    Triangulate 3D points from corresponding 2D points in two images.

    Parameters
    ----------
    P1 : np.ndarray
        Projection matrix for the first camera (3x4).
    P2 : np.ndarray
        Projection matrix for the second camera (3x4).
    points1 : list of tuple or np.ndarray
        Corresponding 2D points in the first image.
    points2 : list of tuple or np.ndarray
        Corresponding 2D points in the second image.

    Returns
    -------
    points_3d : np.ndarray
        Triangulated 3D points (Nx3), in Euclidean coordinates.
    """

    # cv2.triangulatePoints expects the 2D points as 2xN float arrays
    points1 = np.asarray(points1, dtype=np.float64).reshape(-1, 2).T
    points2 = np.asarray(points2, dtype=np.float64).reshape(-1, 2).T

    # Triangulate points using OpenCV's triangulatePoints function (result is 4xN, homogeneous)
    points_3d_homogeneous = cv2.triangulatePoints(P1, P2, points1, points2)

    # Convert the resulting homogeneous coordinates to Euclidean coordinates (Nx3)
    points_3d = cv2.convertPointsFromHomogeneous(points_3d_homogeneous.T).reshape(-1, 3)

    return points_3d

This function takes two projection matrices P1 and P2 (corresponding to two camera views), and the 2D coordinates of corresponding points in both images (points1 and points2). It uses OpenCV's triangulatePoints function to triangulate the 3D coordinates of the points.

To use this function in your photogrammetry pipeline:

  1. Obtain the camera's intrinsic matrix (K-matrix) for both images using the compute_k_matrix function.
  2. Estimate the relative pose (rotation matrix R and translation vector t) between the two camera views using a method like the 5-point algorithm with RANSAC, or another suitable approach. This can be done using the cv2.findEssentialMat and cv2.recoverPose functions in OpenCV.
  3. Compute the projection matrices P1 and P2 using the intrinsic matrices and the relative pose. For the first camera, P1 = K1 [I | 0]. For the second camera, P2 = K2 [R | t], where I is the 3x3 identity matrix, and 0 is a 3x1 zero vector.
  4. Identify and match corresponding points in both images using your feature matching algorithm.
  5. Pass the projection matrices and matched points to the triangulate_points function to obtain the triangulated 3D points.

By following these steps and integrating the triangulate_points function into your photogrammetry pipeline, you can compute the 3D coordinates of points in the scene from corresponding 2D points in two images; a minimal sketch of how the pieces fit together is shown below.
MohamedWZS commented 1 year ago

import numpy as np
import cv2

def generate_point_cloud(images: Images, K_matrix: np.ndarray) -> np.ndarray:
    point_cloud = []

    for feature_match in images.feature_matches:
        image_one = feature_match.image_one
        image_two = feature_match.image_two

        # Extract matched keypoints as Nx2 pixel coordinates
        keypoints_one = np.array([image_one.keypoints[m.queryIdx].pt for m in feature_match.matches])
        keypoints_two = np.array([image_two.keypoints[m.trainIdx].pt for m in feature_match.matches])

        # Estimate the essential matrix with RANSAC
        E, mask = cv2.findEssentialMat(keypoints_one, keypoints_two, K_matrix, method=cv2.RANSAC, prob=0.999, threshold=1.0)

        # Keep only the RANSAC inliers before recovering the pose and triangulating
        inliers = mask.ravel().astype(bool)
        keypoints_one = keypoints_one[inliers]
        keypoints_two = keypoints_two[inliers]

        _, R, t, _ = cv2.recoverPose(E, keypoints_one, keypoints_two, K_matrix)

        # Create projection matrices: P1 = K [I | 0], P2 = K [R | t]
        P1 = K_matrix @ np.hstack((np.eye(3), np.zeros((3, 1))))
        P2 = K_matrix @ np.hstack((R, t))

        # Triangulate points (4xN homogeneous) and dehomogenize to 3xN
        points_4D = cv2.triangulatePoints(P1, P2, keypoints_one.T, keypoints_two.T)
        points_3D = (points_4D / points_4D[3])[:3]

        point_cloud.append(points_3D)

    # Merge all per-pair point clouds into one 3xM array.
    # Note: each pair is reconstructed in its own first-camera frame with an
    # arbitrary scale, so a full pipeline still needs to register them together.
    point_cloud = np.hstack(point_cloud)

    return point_cloud

This function estimates the essential matrix and relative pose for each pair of images, filters outlier matches using the RANSAC mask, creates the projection matrices, then triangulates the points. The resulting 3D points are appended to a list, and the final point cloud is formed by merging all individual point clouds.

Note that this function assumes the camera is calibrated and that there is no lens distortion. If your camera has distortion, you should undistort the images (or at least the matched keypoints) before running this pipeline; a sketch is shown below.

MohamedWZS commented 1 year ago

The lines you mentioned create the projection matrices for two images involved in the feature matching process.

P1 = K_matrix @ np.hstack((np.eye(3), np.zeros((3, 1))))

This line creates the projection matrix P1 for the first image. A projection matrix is a 3x4 matrix that relates 3D world points to 2D image points. P1 is formed by multiplying the camera intrinsic matrix K_matrix (3x3) with the extrinsic matrix, i.e. the rotation matrix (3x3) stacked with the translation vector (3x1). In this case, the extrinsics are the identity rotation and a zero translation vector, meaning the first camera's coordinate system is taken as the world coordinate system.

P2 = K_matrix @ np.hstack((R, t))

This line creates the projection matrix P2 for the second image. Similarly, P2 is formed by multiplying the camera intrinsic matrix K_matrix with the extrinsic matrix. In this case, the extrinsic matrix is the combination of the rotation matrix R and the translation vector t, which are the relative pose (orientation and position) of the second image's camera with respect to the first image's camera.

MohamedWZS commented 1 year ago

The essential matrix is a 3x3 matrix that encodes the geometric relationship between two images taken with the same calibrated camera. It represents the epipolar constraint that exists between corresponding points in the two images. In other words, it relates the position and orientation of the two cameras in a stereo pair and allows you to compute the relative pose (rotation R and translation t) between them.

The relative pose is the transformation between the coordinate systems of the two cameras. The rotation matrix R (3x3) represents the orientation difference between the cameras, and the translation vector t (3x1) represents the position difference between their optical centers.

When I mentioned that the function assumes the camera is calibrated, it means that the camera's intrinsic parameters (focal length, principal point, and any distortion coefficients) are known and have been estimated beforehand. In other words, the K_matrix provided as input to the function should be accurate for the specific camera used to capture the images. A calibrated camera ensures that the geometric relationships between 3D world points and 2D image points are properly modeled, leading to more accurate 3D reconstruction.