STVIR / PMTD

Pyramid Mask Text Detector designed by SenseTime Video Intelligence Research team.
OHEM implementation? #16

WeihongM closed 5 years ago

WeihongM commented 5 years ago

Hello. After reading your paper, I have some questions about your OHEM implementation. Do you mean that OHEM is used in the RPN stage, and only there? My understanding is that you randomly sample some number N of proposals from the RPN output, compute the summed loss for all N proposals, sort them by that loss, and use the top 512 to update the network. I am not sure whether this understanding is right. Thanks for your help.

JingChaoLiu commented 5 years ago

Sorry for the misleading wording in the paper. The sentences in the paper are:

OHEM: In the bounding box branch, we adopt the OHEM [38] to learn the hard samples. In our settings, we first sort the samples provided by RPN in the descending order of the sum of classification loss and location loss, then select the top 512 difficult samples to update the network.

What I meant to express is that we perform OHEM in the bounding box branch. In the bbox branch, we first predict the object class and box offset for the anchors proposed by the RPN, then sort these anchors by the sum of cls_loss and reg_loss. The word RPN is only used to indicate where these anchors come from.

In other words, we change nothing in the RPN, and perform OHEM in the bounding box branch.
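
To make the selection step concrete, here is a minimal sketch in plain Python. It is not the PMTD code: the function names, the smooth L1 form of the location loss, and the convention that label 0 is background (and so contributes no location loss) are all assumptions made for illustration. Only the ranking by cls_loss + reg_loss and the top 512 cut-off come from the description above.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss summed over the box-offset coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total

def ohem_select(cls_logits, labels, box_preds, box_targets, top_k=512):
    """Rank samples by cls_loss + reg_loss and keep the top_k hardest.

    Hypothetical helper: returns (kept indices, hardest first; all losses).
    """
    losses = []
    for logits, label, pred, target in zip(cls_logits, labels, box_preds, box_targets):
        loss = cross_entropy(logits, label)
        if label > 0:  # assumption: label 0 is background, no location loss
            loss += smooth_l1(pred, target)
        losses.append(loss)
    # Descending order of the summed loss, as in the paper's wording.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:top_k], losses

# Toy example: four samples from the RPN, two classes (0 = background).
cls_logits = [[2.0, -1.0], [0.0, 0.0], [-1.0, 2.0], [3.0, -3.0]]
labels = [0, 1, 0, 1]
box_preds = [[0.0] * 4, [0.1, 0.0, 0.0, 0.0], [0.0] * 4, [2.0, 0.0, 0.0, 0.0]]
box_targets = [[0.0] * 4 for _ in range(4)]

keep, losses = ohem_select(cls_logits, labels, box_preds, box_targets, top_k=2)
print(keep)  # → [3, 2]
```

In a real training loop, only the losses of the kept samples would be averaged and backpropagated through the bounding box branch; the RPN losses would be left untouched.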

WeihongM commented 5 years ago

Thanks for your reply, Liu. So, in your words, you don't use the OHEM method to update the network in the ROI box head stage, which is the rcnn head (second stage). My understanding is that in the second stage you use balanced_positive_negative_sample.py from the original Facebook maskrcnn_benchmark implementation for the update. Is that right?

JingChaoLiu commented 5 years ago

No. The bounding box branch refers to the rcnn, and the rcnn is the only part we modify. We didn't change anything in the RPN.

WeihongM commented 5 years ago

Sorry, maybe I was not clear. In some other works, I find that the OHEM strategy is mostly used in the second stage, whereas in your work it seems OHEM is used only in the RPN to update the network? I don't know whether it matters. I found the answer: this is a duplicate of #8. Thanks.