Softmax and Triplet loss

timesler / facenet-pytorch

Pretrained Pytorch face detection (MTCNN) and facial recognition (InceptionResnet) models

MIT License

4.42k stars, 943 forks

I am trying to use this to classify groups of people. Looking at the final layer used for classification, I see that it just outputs the result of a linear layer. Shouldn't a softmax function be applied to that output for multinomial classification?

Edit: I see that it has Linear(512, num_classes) and names the output variable self.logits. I've read a bit about logits as log odds, log(p/(1-p)), but would simply passing 512 features through a linear layer really produce log odds? Isn't an activation function needed at the final layer?
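
To make the question concrete, here is a minimal sketch in plain Python (not the library's code; the function names and the example values are made up for illustration) of how raw linear outputs relate to probabilities. It assumes the usual convention that a linear layer's outputs are treated as unnormalized scores, that softmax turns them into probabilities, and that cross-entropy can be computed straight from the scores. PyTorch's torch.nn.CrossEntropyLoss follows that convention and expects unnormalized logits, which may be why no activation appears on the final layer.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, target):
    # Equals -log(softmax(logits)[target]), computed via log-sum-exp
    # without ever forming the probabilities explicitly.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Hypothetical outputs of a final linear layer for a 3-class problem.
logits = [2.0, 0.5, -1.0]

probs = softmax(logits)
loss = cross_entropy_from_logits(logits, target=0)

# Differences between logits are log ratios of the softmax probabilities.
log_ratio = math.log(probs[0] / probs[1])

print(probs)
print(loss)
print(log_ratio, logits[0] - logits[1])
```

Under these assumptions the linear outputs are only "log odds" up to an additive constant: softmax is unchanged when the same value is added to every logit, so only the differences between logits carry meaning. The predicted class is the same with or without softmax, since softmax preserves the ordering of the scores.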


Softmax and Triplet loss #73