How about replacing the softmax loss like Face Recognition Community did?

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

http://pjreddie.com/darknet/

How about replacing the softmax loss like Face Recognition Community did? #512

Open Jumabek opened 6 years ago

Jumabek commented 6 years ago

Hi @AlexeyAB ,

I have read in multiple places that the face recognition community is replacing the traditional Softmax with Margin Softmax, Angular Margin, and Cosine Margin losses and getting astounding improvements in accuracy. The latest paper on one of these methods: https://arxiv.org/pdf/1801.07698.pdf

From what I have grasped, these methods reduce the intra-class distance and increase the inter-class distance, which gives a significant improvement in verification accuracy.
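To make the margin idea concrete, here is a minimal sketch of an additive cosine margin loss for a single sample, in plain Python. The function names, the scale `s=30.0`, the margin `m=0.35`, and the 2-D toy data are all illustrative assumptions, not code from any of the linked repositories.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_margin_loss(feature, class_weights, target, s=30.0, m=0.35):
    """Cross-entropy over scaled cosine logits, with margin m subtracted
    from the target class only. m = 0 gives a plain normalized softmax."""
    f = normalize(feature)
    cosines = [sum(a * b for a, b in zip(f, normalize(w))) for w in class_weights]
    logits = [s * (c - m) if j == target else s * c for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp over all class logits.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]

# Hypothetical toy data: two class directions and one sample of class 0.
weights = [[1.0, 0.0], [0.0, 1.0]]
sample = [0.8, 0.6]
loss_plain = cosine_margin_loss(sample, weights, target=0, m=0.0)
loss_margin = cosine_margin_loss(sample, weights, target=0, m=0.35)
```

In this toy case `loss_margin` comes out larger than `loss_plain`: the sample is classified correctly either way, but with the margin it is still penalized until it moves closer to its own class direction, which is what pulls same-class features together.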

Would you be interested in implementing such loss functions for YOLO as well?

I myself trying as well, however, my progress is quite slow .

AlexeyAB commented 6 years ago

Hi, @Jumabek

Did you try to implement additive angular margin (ArcFace) or CosineFace instead of Softmax?

[Images: ArcFace, Softmax]
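Written out, the two losses named above take roughly the following form, using the notation of the ArcFace paper linked earlier in this thread. Here x_i is the feature of sample i with label y_i, W_j and b_j are the weight vector and bias of class j, θ_j is the angle between W_j and x_i (both L2-normalized in ArcFace), s is a scale factor, and m is the angular margin. This is a transcription from memory of the paper's definitions, so check it against the paper before relying on it.

```latex
% Softmax loss over N samples and n classes
L_{\text{softmax}} = -\frac{1}{N} \sum_{i=1}^{N}
  \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}
            {\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}}

% ArcFace loss: additive angular margin m on the target class only
L_{\text{arcface}} = -\frac{1}{N} \sum_{i=1}^{N}
  \log \frac{e^{s \cos(\theta_{y_i} + m)}}
            {e^{s \cos(\theta_{y_i} + m)} + \sum_{j \neq y_i} e^{s \cos \theta_j}}
```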

Did you try to use the CUDA code of CosineFace (amsoftmax)? https://github.com/deepinsight/insightface/blob/master/3rdparty/operator/amsoftmax.cu#L13-L36
Or did you try to use the Python code of ArcFace? https://github.com/deepinsight/insightface/blob/40cb9b223a487b6b3aa19f0316d9f7a0b1e28af4/src/train_softmax.py#L195-L234

As I understand it, this is the main problem with Softmax that they want to solve:

However, the Softmax loss function does not explicitly optimise the features to have higher similarity score for positive pairs and lower similarity score for negative pairs, which leads to a performance gap.

So can ArcFace improve accuracy in detection tasks, or only in the binary classification case?

Jumabek commented 6 years ago

@AlexeyAB, I am planning to implement CosineFace because its performance is similar to ArcFace's but it is mathematically a lot simpler. I specifically wanted to implement it in darknet because I have a fire/smoke detector that simply does patch classification. After your question, I now think I may be the only one who needs it, because it is for classification, not for detection.
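The "mathematically simpler" point can be seen by comparing how each method rewrites the target-class logit. The sketch below is illustrative only: the function names are hypothetical, and the default margins (0.35 and 0.5) are assumed typical values rather than settings taken from this thread.

```python
import math

def cosface_target_logit(cos_theta, m=0.35):
    """Additive cosine margin: subtract m from cos(theta)."""
    return cos_theta - m

def arcface_target_logit(cos_theta, m=0.5):
    """Additive angular margin: cos(theta + m)."""
    # Clamp to guard against rounding just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    # theta lies in [0, pi], so sin(theta) is non-negative.
    sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
    # cos(theta + m) = cos(theta) * cos(m) - sin(theta) * sin(m)
    return cos_theta * math.cos(m) - sin_theta * math.sin(m)
```

The cosine margin is a single subtraction, so its gradient with respect to cos(theta) is trivial. The angular margin needs sin(theta) as well, and cos(theta + m) stops decreasing once theta + m exceeds pi, a case this sketch does not handle and that a real implementation would need to treat separately.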

I didn't know there was CUDA code. I guess I will try to use the CUDA code of CosineFace (amsoftmax). If it improves fire/smoke patch classification, I will report it here. But that is not a promise, because I have now switched to a face recognition project.

Thank you for your attention.

AlexeyAB commented 6 years ago

@Jumabek Out of interest, what framework do you use for face recognition? And what is the most advanced framework for face recognition now?

Jumabek commented 6 years ago

@AlexeyAB, according to MegaFace (a challenging FR dataset), the two best methods I am aware of right now are:

CosineFace - Caffe: https://github.com/happynear/AMSoftmax
ArcFace - MXNet: https://github.com/deepinsight/insightface

For now I have settled on the CosineFace implementation at https://github.com/Joker316701882/Additive-Margin-Softmax, which is in Python/TensorFlow.