hunto / image_classification_sota

Training ImageNet / CIFAR models with sota strategies and fancy techniques such as ViT, KD, Rep, etc.
Apache License 2.0

[Question] Could I use DIST in RetinaFace? #11

Closed 22ema closed 1 year ago

22ema commented 1 year ago

Could I use DIST in RetinaFace?

RetinaFace has only two classes (face, not face), so Pearson's correlation coefficient seems to carry little information.

In summary, DIST seems less effective when the number of classes is small, and especially so in the binary case. Is this understanding correct?

hunto commented 1 year ago

Right. DIST has two loss terms: an inter-class loss and an intra-class loss. With only two categories, the inter-class loss carries limited information. The intra-class loss can still be effective, since it transfers relations along the batch axis.
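To make the two terms concrete, here is a minimal pure-Python sketch of a DIST-style loss. It is not the repository's implementation: the helper names (`softmax`, `pearson`, `dist_loss`), the `beta`/`gamma` weights, and the toy logits are all assumptions chosen for illustration. It shows that in the binary case the inter-class term collapses to zero whenever student and teacher agree on the predicted class, while the intra-class term still sees differences across the batch.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over one row of logits.
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pearson(a, b, eps=1e-8):
    # Pearson correlation between two equal-length vectors.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ca = [x - ma for x in a]
    cb = [y - mb for y in b]
    num = sum(x * y for x, y in zip(ca, cb))
    den = math.sqrt(sum(x * x for x in ca)) * math.sqrt(sum(y * y for y in cb))
    return num / (den + eps)

def dist_loss(student_logits, teacher_logits, T=1.0, beta=1.0, gamma=1.0):
    # Rows are samples in the batch, columns are classes.
    ps = [softmax(row, T) for row in student_logits]
    pt = [softmax(row, T) for row in teacher_logits]
    # Inter-class term: correlate across classes, one value per sample.
    inter = sum(1.0 - pearson(s, t) for s, t in zip(ps, pt)) / len(ps)
    # Intra-class term: correlate across the batch, one value per class.
    cols_s, cols_t = list(zip(*ps)), list(zip(*pt))
    intra = sum(1.0 - pearson(s, t) for s, t in zip(cols_s, cols_t)) / len(cols_s)
    return beta * inter + gamma * intra, inter, intra

# Hypothetical binary logits: student and teacher agree on the argmax of
# every sample but differ a lot in confidence.
student = [[2.0, -1.0], [0.5, 0.3], [-1.0, 1.5]]
teacher = [[0.2, 0.0], [3.0, -2.0], [-0.1, 0.4]]

total, inter, intra = dist_loss(student, teacher)
print(f"inter={inter:.6f} intra={intra:.6f} total={total:.6f}")
```

With two classes each probability row is `[p, 1 - p]`, so the per-sample correlation can only be +1 or -1; the inter-class term therefore only signals whether the predicted class matches, not how confident either model is.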

You could still try DIST on your task, as it is easy to implement. For simple tasks that converge easily, a larger softmax temperature is recommended (e.g., T=4 on CIFAR-100); the performance would be much better than with the original T=1.
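As a rough illustration of what the temperature does, the sketch below applies a temperature-scaled softmax to one made-up logit vector. The `softmax` helper and the logit values are assumptions for this example only; the point is that a larger T flattens the distribution, so the non-target classes keep more of the probability mass that the correlation terms compare.

```python
import math

def softmax(logits, T=1.0):
    # Divide logits by the temperature T before normalizing.
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from a confident model on an easy sample.
logits = [4.0, 1.0, 0.5, -2.0]

p_t1 = softmax(logits, T=1.0)  # original temperature
p_t4 = softmax(logits, T=4.0)  # larger temperature, softer distribution

print("T=1:", [round(p, 4) for p in p_t1])
print("T=4:", [round(p, 4) for p in p_t4])
```

On easy tasks the T=1 outputs tend to be close to one-hot, which leaves little relational information to transfer; raising T is one way to expose more of it.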

22ema commented 1 year ago

Thank you for the great response.

22ema commented 1 year ago

Sorry but I have one more question.

Could it be used for bbox or landmark regression rather than classification?

hunto commented 1 year ago

DIST relaxes the absolute approximation into a relative one. However, a regression task requires absolute approximation, so I think DIST is unsuitable for regression, at least when used as the only loss.
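A small sketch of why a correlation-based objective alone is a poor fit for regression: Pearson correlation is unchanged by a positive scaling and a shift, so a prediction can be perfectly correlated with the target while being far off in absolute terms. The coordinate values and the `pearson` helper are hypothetical, and treating a set of coordinates as the vector to correlate is an assumption made only for this illustration.

```python
import math

def pearson(a, b):
    # Pearson correlation between two equal-length, non-constant vectors.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ca = [x - ma for x in a]
    cb = [y - mb for y in b]
    num = sum(x * y for x, y in zip(ca, cb))
    den = math.sqrt(sum(x * x for x in ca)) * math.sqrt(sum(y * y for y in cb))
    return num / den

# Hypothetical target coordinates (e.g., box or landmark positions).
target = [10.0, 20.0, 35.0, 50.0]
# A prediction that is a scaled and shifted copy of the target.
pred = [2.0 * t + 10.0 for t in target]

corr = pearson(pred, target)
l1_error = sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

print(f"correlation-based loss: {1.0 - corr:.6f}")
print(f"mean absolute error:    {l1_error:.2f}")
```

The correlation-based loss is essentially zero here even though every coordinate is wrong, which is why an absolute loss such as L1 would still be needed for the regression heads.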

22ema commented 1 year ago

Thank you for the great response.