wrong corner loss calculation

charlesq34 / frustum-pointnets

Frustum PointNets for 3D Object Detection from RGB-D Data

Apache License 2.0

1.59k stars 538 forks source link

wrong corner loss calculation #60

Open lzljzys opened 5 years ago

lzljzys commented 5 years ago

https://github.com/charlesq34/frustum-pointnets/blob/2ffdd345e1fce4775ecb508d207e0ad465bcca80/models/model_util.py#L384-L385

Currently, the corner points used to calculate corners_dist is a mix of both gt and gt_flip, which seems to be incorrect. It should be only from one of these two, i.e. the one with minimum distance sum over 8 points.

PranjalBiswas commented 5 years ago

Hi, as far as I can understand it is actually the minimum distance itself which is evaluated, as you have mentioned.

kwea123 commented 5 years ago

tf.minimum takes the min of the two quantities.

lzljzys commented 5 years ago

tf.minimum takes the min of the two quantities.

hmm not sure what you mean...

Here is an example which things can go wrong: when tf.norm(corners_3d_pred - corners_3d_gt, axis=-1) = [2,2,2,2,2,2,2,2] and tf.norm(corners_3d_pred - corners_3d_gt_flip, axis=-1) = [3,1,1,1,1,1,1,1] then in the current implementation corners_dist = [2,1,1,1,1,1,1,1]

I think the correct corners_dist should be [3,1,1,1,1,1,1,1]

PranjalBiswas commented 5 years ago

@lzljzys, if I am correct euclidean norm is calculated as square root of sum of squared distances. The distances here refer to the distance between corner points of predicted and gt, and predicted and fipped gt bounding box. So tf.norm does not give out an array but a 1*1 float type tensor. After which tf.minimum evaluates the smallest of the 2.

lzljzys commented 5 years ago

@lzljzys, if I am correct euclidean norm is calculated as square root of sum of squared distances. The distances here refer to the distance between corner points of predicted and gt, and predicted and fipped gt bounding box. So tf.norm does not give out an array but a float type tensor. After which tf.minimum evaluates the smallest of the 2.

shape of corners_3d_pred is Bx8x3, so if I understand correctly, shape of tf.norm(corners_3d_pred - corners_3d_gt, axis=-1) should be Bx8, so it does give out an array/matrix. Which results in the issue in the example I gave above.

PranjalBiswas commented 5 years ago

I have not checked but I doubt the shape will be B*8. May be you can refer the following link for more understanding about norms,

https://en.wikipedia.org/wiki/Matrix_norm

From here it can be seen that a euclidean norm is the maximum singular value of the matrix. So its a single value not an array.

lzljzys commented 5 years ago

I have not checked but I doubt the shape will be B*8. May be you can refer the following link for more understanding about norms,

https://en.wikipedia.org/wiki/Matrix_norm

From here it can be seen that a euclidean norm is the maximum singular value of the matrix. So its a single value not an array.

Just try it... This is what I get, the result is B*8, where B=32 >>> corners_3d_pred = tf.zeros([32, 8, 3]) >>> tf.norm(corners_3d_pred, axis=-1) <tf.Tensor 'norm_1/Squeeze:0' shape=(32, 8) dtype=float32>

kwea123 commented 5 years ago

Alright I understand it now; I also thought it outputs B*1...

You are right, the shape is B*8, and the correct way should be to apply a tf.reduce_mean or tf.reduce_max to combine these 8 corner distances before taking tf.minimum of the two possible flips ?

What do you think?