michalpolic / hololens_mapper

The mapping pipeline for HoloLens device

Doubt in the ModelsAligner method. #5

Closed rakshith95 closed 2 years ago

rakshith95 commented 2 years ago

I was going through the code for the models aligner, and I have a doubt about how the models are aligned. As I understand it, the estimate_euclidean_transformation function (https://github.com/michalpolic/hololens_mapper/blob/master/src/utils/UtilsMath.py#L190) does the alignment; however, it seems to return a similarity transformation, including scale, rather than a Euclidean transformation.

Formulas (40-42) for the transformation in https://web.stanford.edu/class/cs273/refs/umeyama.pdf seem a little different from the ones used in the function. Is it based on a different reference, or is it effectively the same formula? For instance, shouldn't the formula on lines 196-197 for the norm be normc = np.sum(np.sqrt(np.sum(np.multiply(c0,c0),axis=0))), i.e., the sum of the norms of the individual points? The one in the code right now, normc = np.sqrt(np.sum(np.multiply(c0,c0))), adds the squares of all the coordinates of all the points and then takes the square root, rather than summing the norms of the points.
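For reference, the quantities from the paper, as I read them (x_i are the source points, y_i the reference points, n the point count):

$$
\begin{aligned}
\mu_x &= \tfrac{1}{n}\textstyle\sum_i x_i, \qquad \mu_y = \tfrac{1}{n}\textstyle\sum_i y_i, \qquad \sigma_x^2 = \tfrac{1}{n}\textstyle\sum_i \lVert x_i - \mu_x \rVert^2, \\
\Sigma_{xy} &= \tfrac{1}{n}\textstyle\sum_i (y_i - \mu_y)(x_i - \mu_x)^\top = U D V^\top \quad \text{(SVD)}, \\
R &= U S V^\top, \qquad S = \begin{cases} I, & \det(U)\det(V) = 1, \\ \operatorname{diag}(1, 1, -1), & \det(U)\det(V) = -1, \end{cases} \\
c &= \tfrac{1}{\sigma_x^2}\operatorname{tr}(D S), \qquad t = \mu_y - c\,R\,\mu_x.
\end{aligned}
$$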

EDIT: I guess it would be axis=0 or axis=1 depending on how the data is stored
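To make the difference concrete, here is a toy check (hypothetical data, with points stored as columns so axis=0 applies):

import numpy as np

# two centered 3-D points, one per column
c0 = np.array([[3.0, -3.0],
               [0.0,  0.0],
               [4.0, -4.0]])

# current code: square root of the sum of all squared coordinates (Frobenius norm)
frob = np.sqrt(np.sum(np.multiply(c0, c0)))                        # sqrt(50) ~ 7.07

# suggested: norm of each point, then sum
sum_norms = np.sum(np.sqrt(np.sum(np.multiply(c0, c0), axis=0)))   # 5 + 5 = 10

print(frob, sum_norms)   # the two normalizations clearly differ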

michalpolic commented 2 years ago

Hi @rakshith95. Thank you for checking the code. You are absolutely right about the name of the function, i.e., it should be estimate_similarity_transformation. I created a simple unit test here:

import numpy as np
from scipy.spatial.transform import Rotation

# assumed similarity transformation
gt_R = Rotation.random().as_matrix()
gt_t = np.matrix(np.random.rand(3,1))
gt_s = np.random.rand(1,1)
print(f'Original transformation: \ns={gt_s} \nt={gt_t} \nR = {gt_R}\n\n')

# input data
X_transformed = np.matrix(np.random.rand(3,5))      # points to be transformed
X_ref = gt_s * gt_R * X_transformed + gt_t          # reference points

# process
mc = np.mean(X_transformed, axis=1)
mh = np.mean(X_ref, axis=1)
c0 = X_transformed - mc
h0 = X_ref - mh
normc = np.sqrt(np.sum(np.multiply(c0,c0)))                   # Frobenius norm of the centered points
normh = np.sqrt(np.sum(np.multiply(h0,h0)))
normc2 = np.sum(np.sqrt(np.sum(np.multiply(c0,c0),axis=0)))   # proposed: sum of per-point norms
normh2 = np.sum(np.sqrt(np.sum(np.multiply(h0,h0),axis=0)))
c02 = c0 * (1/normc2)   # normalize the raw centered points (c0 itself is rescaled below)
h02 = h0 * (1/normh2)
c0 = c0 * (1/normc)
h0 = h0 * (1/normh)

A = h0 * c0.T                  # correlation of the normalized point sets
U, S, V = np.linalg.svd(A)     # note: V here is already V^T
R = U * V

A2 = h02 * c02.T
U2, S2, V2 = np.linalg.svd(A2)
R2 = U2 * V2

if np.linalg.det(R) < 0:
    V[2,::] = -V[2,::]
    S[2] = -S[2]
    R = U * V

if np.linalg.det(R2) < 0:
    V2[2,::] = -V2[2,::]
    S2[2] = -S2[2]
    R2 = U2 * V2

s = np.sum(S) * normh / normc
s2 = np.sum(S2) * normh2 / normc2

t = mh - s*R*mc
t2 = mh - s2*R2*mc

print(f'Found transformation: \ns={s} \nt={t} \nR = {R}\n\n')
print(f'Proposed transformation: \ns={s2} \nt={t2} \nR = {R2}\n\n')

Please try it. As the result shows, the proposed normalization does not recover the original transformation: the scale, and hence the translation, come out wrong. I did not have time to check the math carefully; I will look at it in the next few days.
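A quick back-of-the-envelope check (to be verified) of why the scale drifts, assuming noise-free data so that the centered points satisfy $h_0 = s\,R\,c_0$: with $N_c = \sum_i \lVert c_{0,i} \rVert$ and $N_h = \sum_i \lVert h_{0,i} \rVert = s\,N_c$,

$$
A_2 = \frac{h_0 c_0^\top}{N_h N_c} = \frac{R\, c_0 c_0^\top}{N_c^2},
\qquad
\operatorname{tr}(D_2) = \frac{\operatorname{tr}(c_0 c_0^\top)}{N_c^2} = \frac{\lVert c_0 \rVert_F^2}{N_c^2},
$$

so the recovered scale would be

$$
s_2 = \operatorname{tr}(D_2)\,\frac{N_h}{N_c} = s\,\frac{\lVert c_0 \rVert_F^2}{N_c^2} \le s,
$$

with equality only in degenerate configurations (Cauchy-Schwarz). The Frobenius normalization instead gives $\operatorname{tr}(D) = 1$ and recovers $s$ exactly.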

rakshith95 commented 2 years ago

@michalpolic I implemented the formulas exactly as in the paper and tested them:

import numpy as np
from scipy.spatial.transform import Rotation

# assumed similarity transformation
gt_R = Rotation.random().as_matrix()
gt_t = np.matrix(np.random.rand(3,1))
gt_s = np.random.rand(1,1)
print(f'Original transformation: \ns={gt_s} \nt={gt_t} \nR = {gt_R}\n\n')

# input data
X_transformed = np.matrix(np.random.rand(3,5))      # points to be transformed
X_ref = gt_s * gt_R * X_transformed + gt_t          # reference points

# process
mc = np.mean(X_transformed, axis=1)
mh = np.mean(X_ref, axis=1)
c0 = X_transformed - mc
h0 = X_ref - mh
normc = np.sqrt(np.sum(np.multiply(c0,c0)))
normh = np.sqrt(np.sum(np.multiply(h0,h0)))
c0 = c0 * (1/normc)
h0 = h0 * (1/normh)

A = h0 * c0.T
U, S, V = np.linalg.svd(A)    
R = U * V

if np.linalg.det(R) < 0:
    V[2,::] = -V[2,::]
    S[2] = -S[2]
    R = U * V

s = np.sum(S) * normh / normc
t = mh - s*R*mc

# AS IN PAPER
c0 = X_transformed - mc
h0 = X_ref - mh

sigma_sqX = (1/X_transformed.shape[1])*((np.linalg.norm(c0))**2)   # sigma_x^2
sigma_sqY = (1/X_ref.shape[1])*((np.linalg.norm(h0))**2)           # sigma_y^2 (not needed for the scale)

# cross-covariance Sigma_xy of the centered point sets (as in the paper)
covar_mat = np.zeros((3,3), dtype=np.float32)
for i in range(X_transformed.shape[1]):
    covar_mat += h0[:,i] @ c0[:,i].T
covar_mat = (1/X_transformed.shape[1])*covar_mat
U, D, VT = np.linalg.svd(covar_mat)   # note: VT is already V^T
D = np.diag(D)

# reflection handling: S = I if det(Sigma_xy) >= 0, else diag(1,1,-1)
# (the paper switches on det(U)det(V) when Sigma_xy is rank-deficient)
if np.linalg.det(covar_mat) >= 0:
    S = np.eye(3)
else:
    S = np.diag([1,1,-1])

R2 = U @ S @ VT                       # rotation, formula (40)
s2 = (1/sigma_sqX)*np.trace(D @ S)    # scale, formula (42)
t2 = mh - s2*R2*mc                    # translation, formula (41)

print(f'Found transformation: \ns={s} \nt={t} \nR = {R}\n\n')
print(f'Proposed transformation: \ns={s2} \nt={t2} \nR = {R2}\n\n') 

And the results are the same as those from the code, i.e., both recover the input R, t, and s. So I guess they are just slightly different implementations.
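They are indeed the same estimator; a sketch of the equivalence (my own derivation, not from the thread): the code's matrix A is just a rescaled $\Sigma_{xy}$,

$$
A = \frac{h_0 c_0^\top}{\lVert h_0 \rVert_F \lVert c_0 \rVert_F} = \frac{n}{\lVert h_0 \rVert_F \lVert c_0 \rVert_F}\,\Sigma_{xy},
$$

so both share the same $U$ and $V$ (a positive scalar only rescales the singular values), hence the same $R$, and the scale agrees as well:

$$
s = \operatorname{tr}(D_A S)\,\frac{\lVert h_0 \rVert_F}{\lVert c_0 \rVert_F}
= \frac{n\,\operatorname{tr}(D S)}{\lVert h_0 \rVert_F \lVert c_0 \rVert_F}\cdot\frac{\lVert h_0 \rVert_F}{\lVert c_0 \rVert_F}
= \frac{\operatorname{tr}(D S)}{\tfrac{1}{n}\lVert c_0 \rVert_F^2}
= \frac{\operatorname{tr}(D S)}{\sigma_x^2},
$$

which is exactly the paper's $c$; the translations then match term by term.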