MohamedWZS opened 1 year ago
import numpy as np
import cv2
def triangulate_points(P1, P2, points1, points2):
    """
    Triangulate 3D points from corresponding 2D points in two images.

    Parameters
    ----------
    P1 : np.ndarray
        Projection matrix for the first camera (3x4).
    P2 : np.ndarray
        Projection matrix for the second camera (3x4).
    points1 : list of tuple or np.ndarray
        Corresponding 2D points in the first image.
    points2 : list of tuple or np.ndarray
        Corresponding 2D points in the second image.

    Returns
    -------
    points_3d : np.ndarray
        Triangulated 3D points (Nx3).
    """
    # cv2.triangulatePoints expects the 2D points as 2xN float arrays, not homogeneous coordinates
    points1_2xn = np.asarray(points1, dtype=np.float64).reshape(-1, 2).T
    points2_2xn = np.asarray(points2, dtype=np.float64).reshape(-1, 2).T
    # Triangulate points using OpenCV's triangulatePoints function (returns 4xN homogeneous coordinates)
    points_4d_homogeneous = cv2.triangulatePoints(P1, P2, points1_2xn, points2_2xn)
    # Convert the resulting homogeneous coordinates to Euclidean (Nx3) coordinates
    points_3d = cv2.convertPointsFromHomogeneous(points_4d_homogeneous.T).reshape(-1, 3)
    return points_3d
This function takes two projection matrices P1 and P2 (corresponding to two camera views), and the 2D coordinates of corresponding points in both images (points1 and points2). It uses OpenCV's triangulatePoints function to triangulate the 3D coordinates of the points.
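As a quick sanity check, here is a minimal usage sketch. The intrinsic matrix, pose, and pixel coordinates below are made-up values chosen only to illustrate the expected shapes, not taken from a real dataset:

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack((np.eye(3), np.zeros((3, 1))))   # first camera at the world origin
R, _ = cv2.Rodrigues(np.array([0.0, 0.1, 0.0]))     # small rotation about the y-axis
t = np.array([[-1.0], [0.0], [0.0]])                # baseline along x
P2 = K @ np.hstack((R, t))

points1 = [(310.5, 240.2), (402.1, 198.7)]          # pixel coordinates in image one
points2 = [(330.8, 240.1), (421.4, 199.0)]          # the matching pixels in image two
points_3d = triangulate_points(P1, P2, points1, points2)
print(points_3d.shape)                              # (2, 3)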
To use this function in your photogrammetry pipeline:
import numpy as np
import cv2
def generate_point_cloud(images: Images, K_matrix: np.ndarray) -> np.ndarray:
    point_cloud = []
    for feature_match in images.feature_matches:
        image_one = feature_match.image_one
        image_two = feature_match.image_two
        # Extract matched keypoints as Nx2 arrays of pixel coordinates
        keypoints_one = np.array([image_one.keypoints[m.queryIdx].pt for m in feature_match.matches])
        keypoints_two = np.array([image_two.keypoints[m.trainIdx].pt for m in feature_match.matches])
        # Estimate the essential matrix and recover the relative pose
        E, mask = cv2.findEssentialMat(keypoints_one, keypoints_two, K_matrix, method=cv2.RANSAC, prob=0.999, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, keypoints_one, keypoints_two, K_matrix)
        # Create projection matrices (first camera at the origin, second at the recovered pose)
        P1 = K_matrix @ np.hstack((np.eye(3), np.zeros((3, 1))))
        P2 = K_matrix @ np.hstack((R, t))
        # Triangulate points (4xN homogeneous), then dehomogenize to 3xN
        points_4D = cv2.triangulatePoints(P1, P2, keypoints_one.T, keypoints_two.T)
        points_3D = (points_4D / points_4D[3])[:3]
        point_cloud.append(points_3D)
    # Merge all point clouds into one 3xM array
    point_cloud = np.hstack(point_cloud)
    return point_cloud
This function estimates the essential matrix and relative pose for each pair of images, creates the projection matrices, then triangulates the points. The resulting 3D points are appended to a list, and the final point cloud is formed by merging all individual point clouds.
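One refinement worth trying (a sketch, reusing the variable names from the snippet above): findEssentialMat already returns a RANSAC inlier mask, so you can drop the outlier matches before recovering the pose and triangulating.

inliers = mask.ravel().astype(bool)          # mask returned by cv2.findEssentialMat
keypoints_one_in = keypoints_one[inliers]
keypoints_two_in = keypoints_two[inliers]
_, R, t, _ = cv2.recoverPose(E, keypoints_one_in, keypoints_two_in, K_matrix)
# ...then triangulate keypoints_one_in / keypoints_two_in instead of the full sets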
Note that this function assumes the camera is calibrated and that the images are free of lens distortion. If your camera does have distortion, you should undistort the images (or the keypoints) before running this pipeline.
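For example, a hedged sketch of undistorting the keypoints instead of the full images, assuming dist_coeffs holds your camera's distortion coefficients (the numbers below are placeholders):

dist_coeffs = np.array([-0.12, 0.05, 0.0, 0.0, 0.0])    # placeholder (k1, k2, p1, p2, k3)
pts = keypoints_one.reshape(-1, 1, 2).astype(np.float64)
# Passing P=K_matrix keeps the result in pixel coordinates, so the rest of the pipeline is unchanged
keypoints_one_ud = cv2.undistortPoints(pts, K_matrix, dist_coeffs, P=K_matrix).reshape(-1, 2)
# ...repeat for keypoints_two before estimating E and triangulating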
The lines you mentioned create the projection matrices for two images involved in the feature matching process.
P1 = K_matrix @ np.hstack((np.eye(3), np.zeros((3, 1))))
This line creates the projection matrix P1 for the first image. A projection matrix is a 3x4 matrix that maps 3D world points to 2D image points. P1 is formed by multiplying the camera intrinsic matrix K_matrix (3x3) with the extrinsic matrix, i.e. the rotation matrix (3x3) stacked with the translation vector (3x1). Here the extrinsic part is the identity rotation with a zero translation vector, which means the first camera's coordinate system is taken as the world coordinate system.
P2 = K_matrix @ np.hstack((R, t))
This line creates the projection matrix P2 for the second image. Similarly, P2 is formed by multiplying the camera intrinsic matrix K_matrix with the extrinsic matrix. In this case, the extrinsic matrix is the combination of the rotation matrix R and the translation vector t, which are the relative pose (orientation and position) of the second image's camera with respect to the first image's camera.
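To make this concrete, here is a small numeric check with a made-up 3D point, assuming P1 and P2 were built as above: projecting the point with each matrix and dividing by the third component gives the pixel coordinates in each image.

X = np.array([0.2, -0.1, 4.0, 1.0])    # a 3D point in the first camera's frame, homogeneous
x1 = P1 @ X                            # 3-vector: homogeneous pixel coordinates in image one
x2 = P2 @ X                            # the same point seen from the second camera
x1 = x1[:2] / x1[2]                    # perspective division -> (u, v) in image one
x2 = x2[:2] / x2[2]                    # perspective division -> (u, v) in image two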
The essential matrix is a 3x3 matrix that encodes the geometric relationship between two images taken with the same calibrated camera. It represents the epipolar constraint that exists between corresponding points in the two images. In other words, it relates the position and orientation of the two cameras in a stereo pair and allows you to compute the relative pose (rotation R and translation t) between them.
The relative pose is the transformation between the coordinate systems of the two cameras. The rotation matrix R (3x3) represents the orientation difference between the cameras, and the translation vector t (3x1) represents the position difference between their optical centers.
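A quick way to see the epipolar constraint in code (a sketch with made-up pixel coordinates, assuming E and K_matrix come from the pipeline above): for a true correspondence, the normalized coordinates of the two points should satisfy x2^T E x1 ≈ 0.

x1_pix = np.array([310.5, 240.2, 1.0])       # a matched point in image one, homogeneous pixels
x2_pix = np.array([330.8, 240.1, 1.0])       # its match in image two
x1_n = np.linalg.inv(K_matrix) @ x1_pix      # back-project to normalized camera coordinates
x2_n = np.linalg.inv(K_matrix) @ x2_pix
residual = x2_n @ E @ x1_n                   # close to zero for an inlier correspondence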
When I mentioned that the function assumes the camera is calibrated, it means that the camera's intrinsic parameters (focal length, principal point, and any distortion coefficients) are known and have been estimated beforehand. In other words, the K_matrix provided as input to the function should be accurate for the specific camera used to capture the images. A calibrated camera ensures that the geometric relationships between 3D world points and 2D image points are properly modeled, leading to more accurate 3D reconstruction.
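In practice that means K_matrix looks something like the matrix below, with values estimated beforehand (for example with cv2.calibrateCamera on checkerboard images); the numbers here are purely illustrative:

fx, fy = 1200.0, 1200.0      # focal lengths in pixels
cx, cy = 960.0, 540.0        # principal point, roughly the image center
K_matrix = np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])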
Issue for debugging and fixing the triangulation step.