DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
MIT License

distance estimation formula in auto_pose6d #72

Closed davideCremona closed 4 years ago

davideCremona commented 4 years ago

Hi,

I've been studying your paper in more detail and I have a question. In the paper, you say that, using the pinhole camera model, you can estimate the object distance with this formula: t_hat_real = t_hat_synthetic * bbox_ratio * focal_length_ratio

Looking at the code, this is implemented at line 100 of codebook.py: z = diag_bb_ratio * mean_K_ratio * render_radius

but since mean_K_ratio is computed as:

K00_ratio = K_test[0,0] / K_train[0,0]  
K11_ratio = K_test[1,1] / K_train[1,1]  

mean_K_ratio = np.mean([K00_ratio,K11_ratio])

and K[0,0] is the horizontal focal length in pixel units while K[1,1] is the vertical focal length in pixel units, I cannot relate this to the f_real/f_syn term of the paper's formula. Are you using another method to compute the focal length ratio, or is this an estimate of the real focal length ratio?

Thank you, Davide

MartinSmeyer commented 4 years ago

Hi @davideCremona,

In the paper, the horizontal and vertical focal lengths are assumed to be equal, which is approximately true for most cameras. In the code, we want to account for the slightly different horizontal and vertical focal lengths that some users might have; the mean is just an estimate of the two ratios, which should be very close in practice.
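
As a rough illustration of the point above, here is a minimal sketch of the computation quoted in the question. The function name `estimate_distance` and all numeric values (intrinsics, bounding-box ratio, render radius) are made up for the example and are not taken from the repository. `diag_bb_ratio` is treated as a given input, since the thread does not spell out how it is computed.

```python
import numpy as np


def estimate_distance(K_test, K_train, diag_bb_ratio, render_radius):
    """Hypothetical helper mirroring the lines quoted from codebook.py."""
    # Ratio of test to training focal lengths, per axis (pixel units).
    K00_ratio = K_test[0, 0] / K_train[0, 0]
    K11_ratio = K_test[1, 1] / K_train[1, 1]
    # Averaging the two ratios gives a single focal length ratio.
    mean_K_ratio = np.mean([K00_ratio, K11_ratio])
    return diag_bb_ratio * mean_K_ratio * render_radius


# Assumed example intrinsics; only the focal length entries matter here.
K_train = np.array([[1000.0, 0.0, 320.0],
                    [0.0, 1000.0, 240.0],
                    [0.0, 0.0, 1.0]])
# Test camera with identical horizontal and vertical focal lengths.
K_equal = np.array([[1075.0, 0.0, 320.0],
                    [0.0, 1075.0, 240.0],
                    [0.0, 0.0, 1.0]])
# Test camera whose focal lengths differ slightly between the axes.
K_unequal = np.array([[1075.0, 0.0, 320.0],
                      [0.0, 1073.0, 240.0],
                      [0.0, 0.0, 1.0]])

diag_bb_ratio = 2.0    # assumed value
render_radius = 700.0  # assumed value, e.g. in millimetres

z_equal = estimate_distance(K_equal, K_train, diag_bb_ratio, render_radius)
z_unequal = estimate_distance(K_unequal, K_train, diag_bb_ratio, render_radius)
print(z_equal, z_unequal)
```

When the two focal lengths are equal, both ratios coincide and the mean reduces to the single f_real/f_syn term of the paper's formula. When they differ slightly, the estimated distance shifts only by a correspondingly small amount.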