ymzhang1919 closed this issue 7 years ago
Why? The purpose of the epsilon is to avoid numerical instability.
I understand the purpose, but I don't understand how it works. Logits can be large negative numbers. How does adding a small positive number to the logits improve the numerical stability of the softmax() operation?
On the other hand, adding a small positive number to the softmax output makes the log() operation more robust, because log() can then never receive an exact zero.
If I am wrong, can you explain it in detail? Thx.
You are right, I have fixed it.
It should be:
epsilon = tf.constant(value=1e-4)
softmax = tf.nn.softmax(logits) + epsilon
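For anyone reading later, here is a rough sketch of why the placement of epsilon matters. It uses NumPy as a stand-in for TensorFlow so it runs without a TF install. The `softmax` helper and the example logits are made up for illustration and are not taken from the repository. Softmax is unchanged when the same constant is added to every logit, so an epsilon added to the logits cancels out. An epsilon added to the softmax output keeps log() finite when a probability underflows to zero.

```python
import numpy as np

def softmax(logits):
    # Hypothetical helper: subtract the row max before exponentiating,
    # the usual guard against overflow in exp().
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

epsilon = 1e-4
# Illustrative logits with large negative entries (float32, as is typical in TF).
logits = np.array([[0.0, -200.0, -1000.0]], dtype=np.float32)

# Epsilon on the logits: the constant shift cancels in the normalization,
# so the probabilities are effectively unchanged.
probs = softmax(logits)
probs_shifted = softmax(logits + epsilon)

# Without epsilon on the output, underflowed probabilities give log(0) = -inf.
with np.errstate(divide="ignore"):
    log_plain = np.log(probs)

# Epsilon on the softmax output: every entry is at least epsilon, so log() is finite.
log_fixed = np.log(probs + epsilon)

print(np.allclose(probs, probs_shifted))
print(np.isinf(log_plain).any())
print(np.isfinite(log_fixed).all())
```

One side effect to keep in mind: after adding epsilon, the values no longer sum exactly to 1, so they are slightly biased probabilities.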