Closed · oroszgy closed this 7 years ago
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!
Hi, thanks for using SentEval. In this particular case, we actually don't need a softmax here, because it is already included in CrossEntropyLoss (http://pytorch.org/docs/master/nn.html#torch.nn.CrossEntropyLoss). If the loss had been NLLLoss, then indeed we would have needed a log-softmax before it. Thanks
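For reference, a minimal sketch of that point (not part of the PR, assuming a recent PyTorch): CrossEntropyLoss applies a log-softmax internally, so it is equivalent to NLLLoss on log-softmaxed logits, and the model itself needs no softmax layer.

```python
# Sketch: CrossEntropyLoss on raw logits equals NLLLoss on
# log-softmaxed logits, so no softmax layer is needed in the model.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)              # batch of 4 examples, 3 classes
targets = torch.tensor([0, 2, 1, 2])    # gold class indices

ce = F.cross_entropy(logits, targets)                      # log-softmax built in
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)    # explicit log-softmax

assert torch.allclose(ce, nll)          # the two losses coincide
```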
@aconneau Thanks for reviewing my PR. You are right that none of the classification models needs an additional softmax layer. However, when you want to use a trained model, especially if you need probability values, the outputs have to be run through the softmax function. As far as I understood the code, predict_proba should emit probability values, which is why I added the softmax calculation only to that method.
Could I ask you to take another look at the PR and, if necessary, revise your decision?
Thanks in advance!
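For illustration only, here is a rough sketch of the idea behind the change; the function below is hypothetical and not the actual SentEval code, with names (model, Xbatch) borrowed from the snippet discussed next.

```python
# Hypothetical sketch: predict_proba should return probabilities, so the
# raw logits from the classifier are passed through a softmax before
# being converted to numpy.
import torch
import torch.nn.functional as F

def predict_proba(model, Xbatch):
    with torch.no_grad():                # inference only, no gradients
        logits = model(Xbatch)           # raw scores; no softmax in the model
        return F.softmax(logits, dim=1).cpu().numpy()  # proper probabilities
```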
You are right.
vals = F.softmax(self.model(Xbatch).data.cpu().numpy())
should be:
vals = F.softmax(self.model(Xbatch).data.cpu()).numpy()
instead? The output of the first one may be a torch Tensor, while the second is a numpy array? (Maybe the conversion is handled automatically; please confirm.) Did you check with the simple bow.py script that it didn't introduce any bug?
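A quick way to see the difference, sketched with a current PyTorch (the version used at the time may have behaved differently): F.softmax operates on torch Tensors, so the softmax has to happen before the conversion to numpy, as in the second line above.

```python
# The softmax must run on the tensor; converting to numpy comes last.
import torch
import torch.nn.functional as F

logits = torch.randn(2, 3)              # stand-in for self.model(Xbatch).data

probs = F.softmax(logits, dim=1).cpu().numpy()  # tensor -> softmax -> numpy: fine
# F.softmax(logits.numpy(), dim=1)              # numpy into F.softmax would fail
```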
@aconneau Sorry for the very late reply. The mentioned conversion is handled automatically, and examples/bow.py runs flawlessly.
There was a bug in the predict_proba method: the call to the final softmax was missing.