Closed Hakuyume closed 5 years ago
Subtracting the maximum element does not improve numerical stability here, because Sigmoid is an element-wise operator. Unlike Softmax, it does not require a sum of exps.
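To illustrate the point, here is a minimal pure-Python sketch, not the code from this PR. The function names `softmax`, `sigmoid`, and `shifted_sigmoid` are hypothetical and chosen only for this example. It assumes the usual definitions: Softmax divides by a sum of exps, so a common shift cancels in the ratio, while Sigmoid depends on each element alone, so a shift by the maximum would change the result rather than stabilize it.

```python
import math


def softmax(xs):
    # Softmax normalizes by a sum of exps. Subtracting max(xs) cancels in the
    # ratio and keeps every exp argument <= 0, so exp cannot overflow.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    # Sigmoid is element-wise: it needs only x, with no sum over other elements.
    # Branching on the sign keeps the exp argument <= 0 (one common way to
    # avoid overflow; an assumption of this sketch, not taken from the PR).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def shifted_sigmoid(xs):
    # Hypothetical "subtract the max first" variant. This computes
    # sigmoid(x - max(xs)), which is a different function from sigmoid(x).
    m = max(xs)
    return [sigmoid(x - m) for x in xs]


if __name__ == "__main__":
    print(softmax([1000.0, 1001.0]))    # large inputs, no overflow
    print(sigmoid(1000.0), sigmoid(-1000.0))
    print(shifted_sigmoid([0.0, 2.0]))  # differs from [sigmoid(0.0), sigmoid(2.0)]
```

In this sketch the shift is harmless for `softmax` because it leaves the output unchanged, whereas for `sigmoid` there is no normalizing sum for it to cancel against.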
I see. Thank you. I will merge.