vasgaowei / pytorch_MELM

The pytorch implementation of the Min-Entropy Latent Model for Weakly Supervised Object Detection
104 stars · 19 forks

Are the refine_loss_1 and refine_loss_2 defined according to Accumulated Recurrent Learning? #11

Open jiafw opened 5 years ago

jiafw commented 5 years ago

I read Fang Wan's paper and your code. In your code:

loss = cls_det_loss / 20 + refine_loss_1*0.1 + refine_loss_2*0.1

I think cls_det_loss / 20 is a global entropy model, the two refine losses are local entropy models, and 0.1 is the regularization weight. And refine_loss_1 and refine_loss_2 belong to different object localization branches, which corresponds to the "Accumulated Recurrent Learning" in the paper. Is that right?
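As a quick sanity check on the weighting, here is a tiny sketch with invented loss values (only the factors 1/20 and 0.1 come from the quoted line; the numbers themselves are made up):

```python
# Hypothetical loss values, purely to illustrate the weighting in the
# quoted line; only the weights 1/20 and 0.1 come from the repository.
cls_det_loss = 8.0    # global entropy (object discovery) loss
refine_loss_1 = 2.0   # local entropy loss, localization branch 1
refine_loss_2 = 1.5   # local entropy loss, localization branch 2

# The 1/20 and 0.1 factors balance the global term against the two
# local (refinement) terms in the total objective.
loss = cls_det_loss / 20 + refine_loss_1 * 0.1 + refine_loss_2 * 0.1
print(round(loss, 4))  # 0.75
```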

By the way, I also want to know whether bbox_pred is used in train mode. I saw your explanation of "bbox_pred = bbox_pred[:,:80]" in #9, but I'm still a little confused, because when I print bbox_pred during training, the values are all 0. So is bbox_pred only used in test mode?

Looking forward to your reply, thank you.

vasgaowei commented 5 years ago
  1. You are right. cls_det_loss / 20 is the global entropy loss for the global entropy model, and refine_loss_1 and refine_loss_2 are the losses for localization branch 1 and localization branch 2.
  2. In Faster R-CNN we have ground-truth bounding boxes as supervision, so we can regress the proposals. But in weakly supervised object detection we only have image-level labels; we don't have ground-truth bounding boxes, so we can't regress the proposals. My code is based on Faster R-CNN, and in order not to delete or modify too many lines of the original Faster R-CNN code, I just set the values of bbox_pred to 0. It means that we don't regress the proposals. Thank you!
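To illustrate point 2, here is a minimal sketch (a simplified re-implementation, not the repository's exact function) of the standard Faster R-CNN box decoding: with the prediction deltas set to all zeros, the decoded boxes are exactly the input proposals, so no regression happens.

```python
import numpy as np

def bbox_transform_inv(boxes, deltas):
    # Standard Faster R-CNN decoding: apply (dx, dy, dw, dh) deltas
    # to proposal boxes given as (x1, y1, x2, y2).
    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights
    dx, dy, dw, dh = deltas[:, 0], deltas[:, 1], deltas[:, 2], deltas[:, 3]
    pred_ctr_x = dx * widths + ctr_x
    pred_ctr_y = dy * heights + ctr_y
    pred_w = np.exp(dw) * widths
    pred_h = np.exp(dh) * heights
    return np.stack([pred_ctr_x - 0.5 * pred_w,
                     pred_ctr_y - 0.5 * pred_h,
                     pred_ctr_x + 0.5 * pred_w - 1.0,
                     pred_ctr_y + 0.5 * pred_h - 1.0], axis=1)

proposals = np.array([[10.0, 10.0, 50.0, 60.0]])
zero_deltas = np.zeros((1, 4))
# All-zero deltas decode to the original proposal, i.e. "no regression".
print(bbox_transform_inv(proposals, zero_deltas))
```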
jiafw commented 5 years ago

I have had another question for a long time: where is the clique partition in your code?

I notice that this function looks something like the clique partition process: def get_refine_supervision(self, refine_prob, ss_boxes, image_level_label):

But it only helps with the object localization loss. The global loss, i.e., the object discovery loss, is defined like this:

    label = torch.tensor(label, dtype=det_cls_prob.dtype, device=det_cls_prob.device)
    zeros = torch.zeros(det_cls_prob.shape, dtype=det_cls_prob.dtype, device=det_cls_prob.device)
    max_zeros = torch.max(zeros, 1 - F.mul(label, det_cls_prob))
    cls_det_loss = torch.sum(max_zeros)
    self._losses['cls_det_loss'] = cls_det_loss / 20

I think max_zeros covers two cases of the loss function. One is that det_cls_prob values in (-1, +∞) are penalized when the label is -1; the other is that det_cls_prob values in (-∞, 1) are penalized when the label is 1. And det_cls_prob is the score matrix obtained by summing all proposals' scores per category in an image. So this loss function activates positive classes whose scores are less than 1 and deactivates negative classes whose scores are greater than -1. Is this loss function equivalent to the clique partition?
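If it helps, the two cases can be checked numerically with a small sketch (the scores and labels below are invented for a 3-class example; the formula mirrors the max_zeros line quoted above):

```python
import torch

# Invented per-image class scores (already summed over proposals) and labels
det_cls_prob = torch.tensor([[1.5, 0.3, -2.0]])
label = torch.tensor([[1.0, 1.0, -1.0]])  # +1 = class present, -1 = absent

# Hinge-style loss: max(0, 1 - label * score), elementwise over classes
zeros = torch.zeros_like(det_cls_prob)
max_zeros = torch.max(zeros, 1 - label * det_cls_prob)
# class 0: label +1, score 1.5 > 1   -> loss 0   (confident positive)
# class 1: label +1, score 0.3 < 1   -> loss 0.7 (positive pushed up)
# class 2: label -1, score -2.0 < -1 -> loss 0   (confident negative)
cls_det_loss = torch.sum(max_zeros)
print(round(cls_det_loss.item(), 4))  # 0.7
```

Only classes inside the margin contribute to the loss, which matches the "activate positives below 1, deactivate negatives above -1" reading.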

Thank you for your detailed answer last time. The questions are a bit long this time; perhaps others have the same questions. Hoping we can get your help. Thank you very much!

b03505036 commented 4 years ago

Hi, is there something wrong with the loss? It seems to be different from the entropy loss in the paper.