Can we remove `ohem` branches in the train net?

mahyarnajibi / SSH

SSH: Single Stage Headless Face Detector

Can we remove `ohem` branches in the train net? #17

Open luoyetx opened 6 years ago

luoyetx commented 6 years ago

The OHEM layers in m1@ssh_ohem share weights with m1@ssh and only provide scores for anchor_target_layer. Can we remove these layers and use the scores from m1@ssh instead? The two seem equivalent.

ghost commented 6 years ago

@luoyetx you can remove them if you prefer not to use OHEM, or you can simply disable it in the configuration file. If you look at the code, OHEM is first used to compute a softmax loss, and the resulting class probabilities are then fed into the detection module to sort out the labels, hence online hard example mining. Hope I got the concept correct!

xuzijian commented 6 years ago

When I read the implementation, I had the same question as @luoyetx. As far as I understand, it seems that we can simplify the OHEM branches.

foralliance commented 6 years ago

The function of OHEM is online mining of negatives and positives.

In anchor_target_layer.py, you can see its effect in the `# Subsample positives` and `# Subsample negatives` sections.

Of course, I think this is just one way of implementing OHEM, and there must be other ways.
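
To make the subsampling idea concrete, here is a minimal sketch of score-based mining. It is not the code from anchor_target_layer.py: the function name `ohem_subsample`, its parameters, and the label convention (1 = face, 0 = background, -1 = ignored) are assumptions chosen for illustration.

```python
def ohem_subsample(labels, face_scores, num_pos, num_neg):
    """Keep only the hardest anchors and mark the rest as ignored (-1).

    Hypothetical helper, not the repository's implementation.
    labels: 1 = face, 0 = background, -1 = ignored (assumed convention).
    face_scores: predicted face probability for each anchor.
    """
    labels = list(labels)  # work on a copy so the input is untouched
    pos = [i for i, label in enumerate(labels) if label == 1]
    neg = [i for i, label in enumerate(labels) if label == 0]

    # Hard positives are faces that the network currently scores low.
    pos.sort(key=lambda i: face_scores[i])
    for i in pos[num_pos:]:
        labels[i] = -1

    # Hard negatives are background anchors that the network scores high.
    neg.sort(key=lambda i: face_scores[i], reverse=True)
    for i in neg[num_neg:]:
        labels[i] = -1

    return labels


labels = [1, 1, 1, 0, 0, 0, 0, -1]
scores = [0.9, 0.2, 0.6, 0.1, 0.8, 0.4, 0.7, 0.5]
kept = ohem_subsample(labels, scores, num_pos=2, num_neg=2)
print(kept)  # [-1, 1, 1, -1, 0, -1, 0, -1]
```

The scores are only used to rank anchors here. Which quota and ranking rule the real layer applies should be checked against the repository.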

luoyetx commented 6 years ago

We only need the scores to sort the training samples. The added branches don't backpropagate gradients, so I think they can be removed. I ran the experiments and got the same result.
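
The argument can be illustrated with a toy forward pass. This is only a sketch under stated assumptions: the weights, features, and the helpers `face_prob` and `hardest_negatives` are made up, and a two-class softmax over a linear layer stands in for the real classification branch. It shows that two branches sharing the same weights yield the same scores, and therefore the same hard-example selection. It says nothing about the training results reported above.

```python
import math


def face_prob(weights, feature):
    """Two-class softmax over a linear layer; returns P(face)."""
    bg = sum(w * x for w, x in zip(weights[0], feature))
    fg = sum(w * x for w, x in zip(weights[1], feature))
    m = max(bg, fg)  # subtract the max for numerical stability
    e_bg, e_fg = math.exp(bg - m), math.exp(fg - m)
    return e_fg / (e_bg + e_fg)


def hardest_negatives(scores, k):
    """Indices of the k anchors with the highest face score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Made-up shared weights and anchor features, for illustration only.
shared_weights = [[0.5, -1.0], [-0.3, 0.8]]
features = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.1], [-1.0, 0.3]]

# The main branch and the duplicated OHEM branch use the same weights.
main_scores = [face_prob(shared_weights, f) for f in features]
ohem_scores = [face_prob(shared_weights, f) for f in features]

same_scores = main_scores == ohem_scores
same_selection = hardest_negatives(main_scores, 2) == hardest_negatives(ohem_scores, 2)
print(same_scores, same_selection)  # True True
```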