SamsungLabs / imvoxelnet

[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
MIT License

Question about the aggregated binary mask #39

Closed jianingwangind closed 2 years ago

jianingwangind commented 2 years ago

Great work, and thanks for sharing the code. I have a question about the aggregated binary mask: according to the paper, the sum is set to one if the resulting value is 0. [image]

But in the code, I didn't find the corresponding implementation. https://github.com/saic-vul/imvoxelnet/blob/3512e89ca98e48aebb21a4c9e9fbe5037220b3a4/mmdet3d/models/detectors/imvoxelnet.py#L71 Here the summed masks are used directly to compute the average. Could you please give me some insight?

Thanks.

filaPro commented 2 years ago

Hi @jianingwangind,

I think in the paper we write 1 here to avoid dividing 0 by 0, while in the code we set the values of invalid voxels to zero. If I'm not missing something, the result should be the same.

jianingwangind commented 2 years ago

Thanks for the reply.

If I understand correctly, when the zero sums are not set to 1 in advance, the division already returns NaN before the invalid voxel values are set to zero, which would interrupt training.

filaPro commented 2 years ago

We do get NaN there, and the next line replaces it with 0, so the overall voxel volume is fine.
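To illustrate why the two formulations give the same forward result, here is a minimal NumPy sketch. It is not the repository's PyTorch code: the function names, the array layout (views on axis 0, voxels on axis 1), and the toy values are all assumptions made for this example.

```python
import numpy as np


def average_nan_then_zero(volumes, masks):
    """Divide by the raw mask sum, then overwrite the 0/0 results with 0."""
    summed = (volumes * masks).sum(axis=0)
    counts = masks.sum(axis=0)
    # 0 / 0 yields NaN for voxels that no view projects into.
    with np.errstate(invalid="ignore", divide="ignore"):
        averaged = summed / counts
    averaged[counts == 0] = 0.0
    return averaged


def average_clamped(volumes, masks):
    """Set zero mask sums to 1 before dividing, as the paper describes."""
    summed = (volumes * masks).sum(axis=0)
    counts = masks.sum(axis=0)
    safe_counts = np.where(counts == 0, 1.0, counts)
    # The numerator is already 0 wherever the count was 0, so the result is 0.
    return summed / safe_counts


# Hypothetical toy data: 2 views, 3 voxels; the last voxel is seen by no view.
volumes = np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
masks = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])

print(average_nan_then_zero(volumes, masks))
print(average_clamped(volumes, masks))
```

Both functions return the same array for this input, with 0 in the voxel that no view covers. The sketch only compares forward values; it says nothing about how gradients behave in the actual PyTorch implementation.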

jianingwangind commented 2 years ago

I see. Thanks for your help.