alexsax / 2D-3D-Semantics

The data skeleton from Joint 2D-3D-Semantic Data for Indoor Scene Understanding
http://3dsemantics.stanford.edu
Apache License 2.0

Order of pose transformations to align EXR with back-projected camera points #16

Closed meder411 closed 6 years ago

meder411 commented 6 years ago

I am trying to align the back-projected pano pointclouds with the EXR ones, but the pose file description is a little vague about the rotations and their order.

I am confident that I've backprojected the points correctly (they are identical to the EXR pointclouds, just off by a rigid body transform).
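For context, here is a minimal sketch (not from the dataset's code, and with an assumed axis convention that may differ from the one the dataset uses) of how an equirectangular depth pano can be back-projected to camera-frame points:

```python
import numpy as np

def backproject_pano(depth):
    """HxW equirectangular depth (ray length per pixel) -> Nx3 camera-frame points."""
    H, W = depth.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
    lon = (u + 0.5) / W * 2 * np.pi - np.pi     # longitude in [-pi, pi)
    lat = np.pi / 2 - (v + 0.5) / H * np.pi     # latitude in [pi/2, -pi/2]
    # Unit ray directions; this particular axis ordering is an assumption,
    # the dataset's convention may permute or negate axes.
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.cos(lat) * np.cos(lon),
                     np.sin(lat)], axis=-1)
    return (dirs * depth[..., None]).reshape(-1, 3)
```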

Could you please articulate the rotation order concerning the two keys below:

"camera_original_rotation": #  The camera's initial XYZ-Euler rotation in the .obj, 
"camera_rt_matrix": #  The 4x3 camera RT matrix, stored as a list-of-lists,

Letting the rotation part of camera_rt_matrix be the 3x3 matrix R and its translation part the 3-vector t, and letting camera_original_rotation be the 3x3 matrix A, how do I align the back-projected points with the EXR pointcloud?

proj_pts is the Nx3 array of points back-projected from the pano; exr_pts is the Nx3 array of points loaded from the EXR file.

I don't understand what "The camera's initial XYZ-Euler rotation in the .obj" means. Where is that applied?

Typically, I'd have thought that

transformed_proj_pts = np.matmul(R, proj_pts[...,None]).squeeze() + t

should do the trick. That's just applying the camera-coordinates-to-world-coordinates transform R * X + t. But the alignment is still off. I have to assume there is some application of the A matrix (camera_original_rotation) that I'm missing. I've spent a day trying various permutations and transposes of these transforms and visualizing the results, but I can't get the pointclouds aligned. What's the correct way to do this?

Note: my A matrix is formed as:

import numpy as np
from numpy import cos, sin

# eul is the 3-vector loaded from the 'camera_original_rotation' key
phi, theta, psi = eul
ax = np.array([
    [1, 0, 0],
    [0, cos(phi), -sin(phi)],
    [0, sin(phi), cos(phi)]])
ay = np.array([
    [cos(theta), 0, sin(theta)],
    [0, 1, 0],
    [-sin(theta), 0, cos(theta)]])
az = np.array([
    [cos(psi), -sin(psi), 0],
    [sin(psi), cos(psi), 0],
    [0, 0, 1]])
A = az @ ay @ ax  # extrinsic XYZ: rotate about x, then y, then z
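As a sanity check (this is my own sketch, not part of the dataset), the hand-rolled Euler-to-matrix construction above should match SciPy's extrinsic 'xyz' convention, which composes as Rz(psi) @ Ry(theta) @ Rx(phi):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def euler_xyz_to_matrix(eul):
    """Extrinsic XYZ Euler angles -> 3x3 rotation matrix (az @ ay @ ax)."""
    phi, theta, psi = eul
    ax = np.array([[1, 0, 0],
                   [0, np.cos(phi), -np.sin(phi)],
                   [0, np.sin(phi), np.cos(phi)]])
    ay = np.array([[np.cos(theta), 0, np.sin(theta)],
                   [0, 1, 0],
                   [-np.sin(theta), 0, np.cos(theta)]])
    az = np.array([[np.cos(psi), -np.sin(psi), 0],
                   [np.sin(psi), np.cos(psi), 0],
                   [0, 0, 1]])
    return az @ ay @ ax

eul = np.array([0.3, -1.1, 2.0])
A_manual = euler_xyz_to_matrix(eul)
A_scipy = Rotation.from_euler('xyz', eul).as_matrix()  # lowercase = extrinsic
assert np.allclose(A_manual, A_scipy)
```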

UPDATE: I ran a brute force approach where I computed all possible combinations and permutations of +/-t, +/-c, R/R^T, and A/A^T. None of them resulted in even a near-perfect match. I'm thoroughly confused.

meder411 commented 6 years ago

For anyone else who struggles with this, the alignment solution is:

x = ARAx' + c

Where

x' := back-projected points
x := exr points
A := rotation matrix from Euler angles
R := rotation from Rt matrix
c := camera center

Explanation: AR converts the rotation from the camera coordinate system to the adjusted camera coordinate system using the A matrix formed by the Euler angles given in camera_original_rotation. AR thus is a rotation from the adjusted camera coordinate system to the global coordinate system. Ax' converts the back-projected points to the adjusted camera coordinate system. Therefore, the composition ARAx' transforms the back-projected points to the global coordinate system. c is already a point in the global coordinate system, so now we just need to shift the transformed points to their proper location at the camera center.
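The alignment above can be sketched as a few lines of NumPy (a self-contained illustration with synthetic data; the variable names are mine, not the pose-file keys), including a round-trip check that points built with x = A R A x' + c are recovered exactly:

```python
import numpy as np

def align(proj_pts, A, R, c):
    """Map Nx3 back-projected pano points into the EXR/world frame: A R A x' + c."""
    return proj_pts @ (A @ R @ A).T + c

# Synthetic round-trip check: build exr_pts with the same formula
# and confirm align() reproduces them.
rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))
c = rng.standard_normal(3)
proj_pts = rng.standard_normal((100, 3))
exr_pts = (A @ R @ A @ proj_pts.T).T + c
assert np.allclose(align(proj_pts, A, R, c), exr_pts)
```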

anita94 commented 3 years ago

@meder411 Thanks a lot for the solution. I was struggling with almost the same problem and your post really helped me! I just needed to rotate the points 90 degrees more to get what I needed.