THU-DA-6D-Pose-Group / GDR-Net

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. (CVPR 2021)
https://github.com/THU-DA-6D-Pose-Group/GDR-Net
Apache License 2.0

About the region loss #103

Open igodrr opened 1 year ago

igodrr commented 1 year ago

Hi authors,

About the region loss:

loss_dict["loss_region"] = loss_func(out_region * gt_mask_region[:, None], gt_region * gt_mask_region.long()) / gt_mask_region.sum().float().clamp(min=1.0)

I think out_region * gt_mask_region[:, None] can't remove the background loss. This operation sets every channel of a background pixel to 0, so after the softmax inside the CE loss each channel becomes 1/65. I don't know whether my understanding is correct and hope to get your reply. Thanks very much!
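The softmax claim above can be checked with a few lines of plain Python. This is a minimal sketch, not code from the repository: it assumes 65 region channels (as implied by the 1/65 figure) and uses made-up logits for a single background pixel.

```python
import math

NUM_CLASSES = 65  # assumption: channel count implied by the 1/65 figure above


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw network output for one background pixel.
raw_logits = [0.1 * (c - 30) for c in range(NUM_CLASSES)]
mask = 0.0  # gt_mask_region is 0 on background pixels

# Equivalent of out_region * gt_mask_region[:, None] for this pixel.
masked_logits = [x * mask for x in raw_logits]
probs = softmax(masked_logits)

print(probs[0], 1 / NUM_CLASSES)  # every channel gets the same probability
```

Whatever the raw logits were, the masked pixel ends up with a uniform distribution over all channels.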

wangg12 commented 1 year ago

Hi, this formulation does not remove the background loss; it tries to map the background to a "background" region. To truly remove the background loss, one might need to use the "ignore" trick. You can try that out, but I think it would not make a big difference.
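The "ignore" trick is not spelled out in the reply. One plausible reading is to give background pixels a sentinel label and exclude them from the cross-entropy entirely; in PyTorch, torch.nn.functional.cross_entropy exposes an ignore_index argument for this. The pure-Python sketch below illustrates the idea only. The IGNORE value, the helper names, and the 4-class toy data are all hypothetical.

```python
import math

IGNORE = -1  # hypothetical sentinel label for pixels that should not be scored


def cross_entropy(logits, target):
    """Cross-entropy of one pixel: log-sum-exp of the logits minus the target logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def masked_region_loss(pixel_logits, pixel_targets):
    """Mean cross-entropy over the pixels whose target is not IGNORE."""
    losses = [
        cross_entropy(logits, target)
        for logits, target in zip(pixel_logits, pixel_targets)
        if target != IGNORE
    ]
    return sum(losses) / max(len(losses), 1)  # guard against an empty mask


# Toy example: three pixels, four classes; the last pixel is background.
pixel_logits = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],
    [5.0, -1.0, 0.5, 2.0],
]
pixel_targets = [0, 1, IGNORE]

loss = masked_region_loss(pixel_logits, pixel_targets)
print(loss)
```

With this formulation the ignored pixel adds nothing to the loss value, and the average is taken over foreground pixels only.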

igodrr commented 1 year ago

Hi, thanks for your reply! But I think this can't map the background to a "background" region. As I mentioned above, setting all the channels of a pixel to 0 makes them all become 1/65 after passing through the softmax, so the loss of this "background pixel" is -log(1/65) = log(65). I don't quite understand what you mean by mapping the background into the "background" region. Is the purpose of this operation to eliminate the interference of the background? If yes, why not just ignore it? Looking forward to your reply! Thanks very much!
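The per-pixel loss value in question can be worked out directly. The sketch below assumes 65 channels and that the masked target for a background pixel is class 0 (the result of gt_region * gt_mask_region.long()); the two example logit vectors are made up.

```python
import math

NUM_CLASSES = 65  # assumption: channel count implied by the 1/65 figure


def cross_entropy(logits, target):
    """Cross-entropy of one pixel: log-sum-exp of the logits minus the target logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def background_pixel_loss(raw_logits):
    """Loss of one background pixel after the mask multiplies logits and target by 0."""
    masked_logits = [x * 0.0 for x in raw_logits]  # out_region * gt_mask_region
    masked_target = 0  # gt_region * gt_mask_region.long()
    return cross_entropy(masked_logits, masked_target)


# Two very different hypothetical predictions for the same background pixel.
predicts_class_0 = [10.0] + [0.0] * (NUM_CLASSES - 1)
predicts_class_64 = [0.0] * (NUM_CLASSES - 1) + [10.0]

loss_a = background_pixel_loss(predicts_class_0)
loss_b = background_pixel_loss(predicts_class_64)

print(loss_a, loss_b, math.log(NUM_CLASSES))  # all three are log(65), about 4.17
```

Under these assumptions each background pixel contributes the constant log(65) regardless of what the network predicted there. A term that does not depend on the prediction has zero gradient with respect to it, so it shifts the reported loss value without pushing background pixels toward any particular region.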