mapillary / OpenSfM

Open source Structure-from-Motion pipeline
https://www.opensfm.org/
BSD 2-Clause "Simplified" License

[Help for Project] Reprojection using point cloud and json of OpenSfM #1004

Open JACMoutinho opened 1 year ago

JACMoutinho commented 1 year ago

Good afternoon, everyone

I wanted to see if anyone could help me with a project I'm doing using OpenSfM. My objective is to project from 3D to 2D, using a .ply point cloud, and draw the matching pixels onto an image. The problem I have right now is scaling.

[image: image_test]

The image above represents 10% of the points from the 1.6 million vertex point cloud. You can see that it has the shape of the pillar, but the scaling is all wrong. I'm using reconstruction.json from the undistorted folder and the EXIF library to extract information from the images to build the camera intrinsic matrix.

Below, there's a zip with image, reconstruction.json and the ipynb code (also the code in .txt form):

iPYNB code.zip Projection.txt

I appreciate any help. Thank you in advance and have a wonderful weekend.

fabianschenk commented 1 year ago

Hi @JACMoutinho,

First of all, great that you even got so far :D Puh, there are a few things that could be wrong.

1. Do you transform from whatever world coordinate system the point cloud lives in to your image? You'll need to transform first (from world to camera: T_cam_world) and then project into your image with the camera matrix K, something like p_img = K * T_cam_world * point_3d.
2. Another thing is that we typically use normalized coordinates, so you have to convert to the actual image pixels first. There are methods in the code for that.
3. Is there some scaling involved in the pipeline? Your camera parameters might be scaled.
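As a minimal sketch of the chain in point 1 (pinhole model only, no distortion; the function name and numbers are hypothetical):

```python
import numpy as np

def project_point(K, R, t, point_world):
    """Project a 3D world point into pixel coordinates.

    R (3x3) and t (3,) are the world-to-camera rotation and translation,
    K is the 3x3 intrinsic matrix expressed in pixels.
    """
    p_cam = R @ point_world + t   # world frame -> camera frame
    uvw = K @ p_cam               # camera frame -> homogeneous pixels
    return uvw[:2] / uvw[2]       # perspective divide

# Toy example: camera at the origin looking down +Z, point straight ahead.
K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
print(project_point(K, R, t, np.array([0.0, 0.0, 5.0])))  # -> [320. 240.]
```

A point on the optical axis lands exactly on the principal point, which is a quick sanity check for the whole chain.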

Best, Fabian

JACMoutinho commented 1 year ago

Hello @fabianschenk

First of all, let me thank you for your quick answer and for taking the time to help me out.

  1. Yes, for K I use the metadata taken with EXIF directly from the image (specifically the focal length, width, and height):

```python
f = img.focal_length
C_x = int(image_src.shape[1] / 2)
C_y = int(image_src.shape[0] / 2)

width = int(image_src.shape[1])
height = int(image_src.shape[0])
f_x = f * img.x_resolution
f_y = f * img.y_resolution
int_matrix = np.array([[f_x, 0, C_x],
                       [0, f_y, C_y],
                       [0,   0,   1]])
```

For the T_cam_world I use the information given by the reconstruction.json file in the undistorted folder of OpenSfM to retrieve the rotation and translation from the camera poses, then use cv2.Rodrigues to get a 3x3 rotation matrix from the 1x3 rotation vector.

```python
for i, img_name in enumerate(images):
    rotation = shots[img_name]['rotation']
    translation = shots[img_name]['translation']
    gps_position = shots[img_name]['gps_position']
    rows_shots.append([img_name[:-4], rotation, translation, gps_position])

img_data = pd.DataFrame(rows_shots, columns=["Image", "Rotation", "Translation", "GPS Position"])
```

I then use that information to build my projection matrix:

```python
rotation_matrix_13 = img_data[img_data['Image'] == img_test]['Rotation'].to_numpy()[0]
translation_matrix = np.matrix(img_data[img_data['Image'] == img_test]['Translation'].to_numpy()[0]).T
rotation_matrix = cv2.Rodrigues(np.matrix(rotation_matrix_13))[0]
RT_matrix = np.append(rotation_matrix, translation_matrix, 1)
RT_matrix4x4 = np.vstack((RT_matrix, [0, 0, 0, 1]))
ones = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
P = int_matrix.dot(ones).dot(RT_matrix4x4)
coords2D = P.dot(test_coord.T)
coords2D = np.array(coords2D.T)
u = coords2D[0] / coords2D[2]
v = coords2D[1] / coords2D[2]
```

For the 3D coordinates I use the merged.ply generated by OpenSfM:

```python
PLY_file = "merged.ply"
vertices = []
with open(PLY_file, 'r') as f:
    line = f.readline().strip()
    num_vertices = None
    # Scan the ASCII header for the vertex count.
    while line != 'end_header':
        elements = line.split()
        if elements[0] == 'element' and elements[1] == 'vertex':
            num_vertices = int(elements[2])
        line = f.readline().strip()
    # Read the vertex records that follow the header.
    for _ in range(num_vertices):
        line = f.readline().strip()
        vertex = list(map(float, line.split()))
        vertices.append(vertex)
```
  2. In terms of normalization, I use the raw values from the .ply, which I assume are in the "world" coordinate system the point cloud is built in (I guess a better way to put it is that the coordinates follow OpenSfM's own coordinate system).

My process to convert back to 2D, taking into consideration the code I presented in point 1, is P = K * [R|t]:

```python
P = int_matrix.dot(ones).dot(RT_matrix4x4)
coords2D = P.dot(test_coord.T)
coords2D = np.array(coords2D.T)
u = coords2D[0] / coords2D[2]
v = coords2D[1] / coords2D[2]
```
  3. In terms of the camera matrix, I use the focal length in millimeters and use the resolution to convert that value to pixels. Same for the center C_x and C_y:

```python
f = img.focal_length
C_x = int(image_src.shape[1] / 2)
C_y = int(image_src.shape[0] / 2)

width = int(image_src.shape[1])
height = int(image_src.shape[0])
f_x = f * img.x_resolution
f_y = f * img.y_resolution
int_matrix = np.array([[f_x, 0, C_x],
                       [0, f_y, C_y],
                       [0,   0,   1]])
```

That's the only scaling I'm using. For the camera matrix, I'm only using the metadata from the image file.
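For what it's worth, if I remember the OpenSfM conventions correctly, the `focal` stored in reconstruction.json is normalized by max(width, height), so a K built from EXIF focal length and resolution can easily differ from the reconstruction's intrinsics by a scale factor. A sketch of building K from the reconstruction's own camera model instead (the camera entry below is a made-up stand-in for one `cameras` record):

```python
import numpy as np

# Stand-in for one camera entry from reconstruction.json
# (perspective model; "focal" is assumed normalized by max(width, height)).
cam = {"projection_type": "perspective",
       "width": 4000, "height": 3000,
       "focal": 0.85, "k1": 0.0, "k2": 0.0}

scale = max(cam["width"], cam["height"])   # normalization factor
f_pixels = cam["focal"] * scale            # normalized focal -> pixels
K = np.array([[f_pixels, 0.0, cam["width"] / 2.0],
              [0.0, f_pixels, cam["height"] / 2.0],
              [0.0, 0.0, 1.0]])
print(K[0, 0])  # -> 3400.0
```

Using the intrinsics the pipeline actually optimized, rather than the raw EXIF values, keeps K consistent with the poses and points in the same file.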

[image]

I'm sorry if I'm all over the place with this explanation and wasn't able to answer your questions as you expected; I am fairly new to this subject.

EDIT: [image]

For consideration: could it be that the translation and rotation in the reconstruction.json file use a different reference frame than the point cloud? If so, is there a default scaling value in the pipeline that I could use to "renormalize"/"denormalize" the 2D or 3D coordinates?

fabianschenk commented 1 year ago

Hi @JACMoutinho ,

The code you shared seems correct, even though I didn't check all the details. Can you try to read the 3D points from reconstruction.json instead of the .ply file?
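Reading the sparse points straight from reconstruction.json keeps them in the same coordinate frame as the shot poses in that file. A sketch, with field names from memory of the OpenSfM output format (the dict below stands in for a loaded file, which actually holds a list of reconstructions):

```python
import numpy as np

# Stand-in for one parsed reconstruction: "points" maps a track id to
# its 3D coordinates and color.
reconstruction = {
    "points": {
        "101": {"coordinates": [1.0, 2.0, 3.0], "color": [255, 0, 0]},
        "102": {"coordinates": [4.0, 5.0, 6.0], "color": [0, 255, 0]},
    }
}

points = np.array([p["coordinates"]
                   for p in reconstruction["points"].values()])
print(points.shape)  # -> (2, 3)
```

With the real file you would first do `reconstruction = json.load(fh)[0]` and then stack the coordinates the same way.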

Maybe you can also look through the code for examples of how to reproject. We're not actively working on OpenSfM anymore, and the last time I used it was around 1.5 to 2 years ago, so I don't remember all the details anymore.

Good luck, Fabian