av-savchenko / face-emotion-recognition

Efficient face emotion recognition in photos and videos
Apache License 2.0
692 stars 127 forks

about multi-task learning #28

Closed kulich-d closed 1 year ago

kulich-d commented 1 year ago

Hi, thank you for your great work!

I've read your paper, and I have a question about multi-task learning.

Am I right that while all the new heads (emotions, age, gender, etc.) are being trained, the "CNN lower layers" are frozen, meaning you train only the heads for each task? If so, am I right that age, gender, and ethnicity don't influence the face emotion recognition features? As far as I can see, you used EfficientNet as the backbone, added a dense layer as a classifier, and trained this architecture to predict emotion labels. I can't understand how adding the face attribute recognition tasks influences the FER accuracy, given that the common part of the NN architecture is frozen during training :(

av-savchenko commented 1 year ago

Thanks for your question. In fact, we used this training pipeline for our MobileNet model. After training the model on face identification, we added a new hidden layer and an output layer to predict age and gender. Finally, the classification layers were removed, and the emotion classification branch was added after the hidden layer (trained on age and gender). However, you're correct: our best EfficientNet models do not use such intermediate training on age and gender. If you take a look at our recent paper, e.g., from the CVPR22 Workshop, you will notice that this step with age and gender is not mentioned: "The last layer of the network pre-trained on VGGFace2 is replaced by the new head"
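To make the MobileNet pipeline above concrete, here is a minimal sketch in plain Python with toy linear layers. It is not the repository's code: the layer sizes, the squared-error loss, the linear hidden layer, and every name (`backbone`, `hidden`, `attr_head`, `emotion_head`, `train_attributes`) are assumptions made for illustration. It only shows the idea that, with the backbone frozen, the age/gender stage can still shape the emotion branch through the hidden layer the branch is attached to.

```python
import random

random.seed(0)

def init(rows, cols):
    """Random weight matrix with `rows` outputs and `cols` inputs."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def linear(weights, x):
    """Bias-free dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

# Stage 1 stand-in: a backbone "pre-trained" on face identification (toy sizes).
backbone = init(4, 6)    # 6 inputs -> 4 shared features, kept frozen below

# Stage 2: a new hidden layer plus an output layer for age and gender.
hidden = init(3, 4)
attr_head = init(2, 3)   # two toy targets standing in for age and gender

def train_attributes(samples, lr=0.05, epochs=20):
    """Squared-error SGD that updates hidden and attr_head, never backbone."""
    for _ in range(epochs):
        for x, target in samples:
            feats = linear(backbone, x)   # frozen shared features
            h = linear(hidden, feats)     # linear hidden layer, for brevity
            err = [p - t for p, t in zip(linear(attr_head, h), target)]
            # Gradient w.r.t. h, computed before attr_head is modified.
            grad_h = [sum(err[i] * attr_head[i][k] for i in range(len(err)))
                      for k in range(len(h))]
            for i, row in enumerate(attr_head):
                for k in range(len(row)):
                    row[k] -= lr * err[i] * h[k]
            for k, row in enumerate(hidden):
                for j in range(len(row)):
                    row[j] -= lr * grad_h[k] * feats[j]

# Random toy data; real training would use labelled face images.
samples = [([random.random() for _ in range(6)],
            [random.random(), random.random()]) for _ in range(8)]

backbone_before = [row[:] for row in backbone]
hidden_before = [row[:] for row in hidden]
train_attributes(samples)

# Stage 3: drop attr_head and attach an emotion branch after the hidden layer.
emotion_head = init(5, 3)   # 5 toy emotion classes (hypothetical count)

def emotion_logits(x):
    """Emotion branch fed by the hidden layer that was trained on age/gender."""
    return linear(emotion_head, linear(hidden, linear(backbone, x)))

print("backbone unchanged:", backbone == backbone_before)
print("hidden layer updated:", hidden != hidden_before)
```

In this sketch the backbone weights are identical before and after the attribute stage, while the hidden layer is not. Whether the hidden layer was frozen or fine-tuned further during emotion training is not stated in the thread, so stage 3 here only attaches the new branch. The EfficientNet variant described above would skip stage 2 entirely and place the new head directly on the pre-trained network.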

kulich-d commented 1 year ago

Thank you so much for such a quick answer!