Closed tsung-jui-wu closed 2 years ago
The model is indeed trained with binary cross-entropy (BCE) loss. The reason there's no sigmoid layer in the model definition is that PyTorch provides `nn.BCEWithLogitsLoss`, which combines the sigmoid and BCE in a single, numerically more stable loss, so the model can output raw logits directly.
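For illustration, here's a minimal sketch (with made-up logits and labels, not taken from this repo) showing that `nn.BCEWithLogitsLoss` on raw logits matches applying a sigmoid followed by `nn.BCELoss`:

```python
import torch
import torch.nn as nn

# Raw, unbounded model outputs (logits) -- no sigmoid applied yet.
logits = torch.tensor([2.5, -1.0, 0.3])
targets = torch.tensor([1.0, 0.0, 1.0])

# BCEWithLogitsLoss applies the sigmoid internally, so the model's
# final layer can emit any real number.
loss = nn.BCEWithLogitsLoss()(logits, targets)

# Equivalent (but less numerically stable) two-step version:
manual = nn.BCELoss()(torch.sigmoid(logits), targets)
print(torch.allclose(loss, manual))  # True
```

This is why the fused loss is preferred in practice: it avoids computing `log(sigmoid(x))` in two steps, which can underflow for large negative logits.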
Thanks for the reply, it works wonders.
According to your picture, the final layer of the model should decide whether the input is real or fake (0/1), but the model in
spatiotemporal_net.py
does not have a softmax (or sigmoid) layer at the end. Instead, the model outputs a number that ranges roughly from -50 to 50 (or some other range outside [0, 1]). Does this mean the loss is computed in some other way when training the classifier? If there is indeed a loss function other than binary cross-entropy, could you share it with us?