In Caffe, the SigmoidCrossEntropyLoss layer combines both sigmoid activation and the binary cross entropy loss in one layer in order to provide numerical stability. At test/retrieval time, the sigmoid activation is added as a dedicated layer in order to get the output of fc8 in the correct range.
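To illustrate the numerical stability point, here is a minimal NumPy sketch (not Caffe's actual source) of a cross entropy computed directly from the raw fc8 scores, plus the separate sigmoid you would apply at test time. The function names are made up for this example, and Caffe's batch normalization of the loss is left out.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid as applied at test/retrieval time; maps scores into (0, 1)."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(-np.abs(x))  # always in (0, 1], so it never overflows
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

def sigmoid_cross_entropy_from_logits(logits, targets):
    """Binary cross entropy on raw (linear) scores, summed over all elements.

    Uses the identity
        -[t*log(s(x)) + (1-t)*log(1-s(x))] = max(x, 0) - x*t + log(1 + exp(-|x|))
    so that log(0) and exp(large) are never evaluated.
    """
    x = np.asarray(logits, dtype=np.float64)
    t = np.asarray(targets, dtype=np.float64)
    per_element = np.maximum(x, 0.0) - x * t + np.log1p(np.exp(-np.abs(x)))
    return per_element.sum()

# Hypothetical fc8 scores and binary targets (e.g. PHOC bits).
scores = np.array([1000.0, -1000.0, 0.5])
targets = np.array([0.0, 1.0, 1.0])

loss = sigmoid_cross_entropy_from_logits(scores, targets)  # stays finite
probs = sigmoid(scores)                                    # values in [0, 1]
```

A naive version that first computes `sigmoid(scores)` and then takes `log` of the result would hit `log(0)` for the first two scores, which is the problem the combined layer avoids.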
Thank you very much!
Does the code 'n.loss = L.SigmoidCrossEntropyLoss(n.fc8, n.phocs)' imply that the cross entropy loss is applied to the output of the last fully connected layer, which has a linear activation?