yuanyao366 / PRP


About softmax #2

Closed 321hallelujah closed 3 years ago

321hallelujah commented 3 years ago

Hi,
In both the self-supervised pre-training stage and the fine-tuning stage, your work does not apply softmax after the linear layer, while some other related works do. So my question is: during training, what difference does it make whether softmax is applied or not? Best wishes.

yuanyao366 commented 3 years ago

We use nn.CrossEntropyLoss(), which combines nn.LogSoftmax() and nn.NLLLoss() in a single class, so the linear layer outputs raw logits and no explicit softmax layer is needed.
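A minimal sketch (not taken from the PRP code; the tensor shapes and values are made up for illustration) showing why the explicit softmax is redundant: nn.CrossEntropyLoss applied to raw logits gives the same loss as nn.LogSoftmax followed by nn.NLLLoss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 10)           # hypothetical batch of 4 samples, 10 classes (raw linear outputs)
targets = torch.tensor([1, 0, 3, 7])  # hypothetical class labels

# CrossEntropyLoss expects raw logits and applies log-softmax internally.
loss_from_logits = nn.CrossEntropyLoss()(logits, targets)

# Equivalent computation done explicitly: log-softmax, then NLL loss.
log_probs = nn.LogSoftmax(dim=1)(logits)
loss_explicit = nn.NLLLoss()(log_probs, targets)

print(loss_from_logits.item(), loss_explicit.item())  # the two values match
```

In practice this also means one should not add an nn.Softmax layer before nn.CrossEntropyLoss, since softmax would then effectively be applied twice, which changes the loss and its gradients.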