Closed: rononrun closed this issue 4 years ago
Using tf.nn.softmax directly can cause GPU out-of-memory errors when testing on large inputs, and I found that computing the softmax manually avoids this. If you haven't run into that problem, you can use tf.nn.softmax directly.
Hi psychopa4, I see. Ok got it! Thank you very much for the clarification sir.
Hi psychopa4, appreciate the great job you've done here!
I have a question about your implementation of the softmax in NLBlock. It appears that you have opted to compute the softmax manually instead of using tf.nn.softmax (which you have commented out). Is there any reason for this?
In some cases during training, I have seen f go to inf because a fairly large number gets exponentiated (something around exp(300+) brings it to inf), whereas tf.nn.softmax handles this properly.
Thanks!
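For anyone hitting the same overflow with a hand-written softmax, here is a minimal sketch of the usual workaround: subtract the maximum logit before exponentiating. Softmax is unchanged by adding a constant to every input, so the result is the same, but the largest exponent becomes exp(0) = 1 and cannot overflow. This is plain Python for illustration only, not the code from NLBlock. The function names and the logit values are made up, and the values are set to 1000 so that the overflow also shows up with Python's float64 (where math.exp raises OverflowError instead of returning inf). In float32 the overflow starts much earlier, at roughly exp(88.7).

```python
import math

def naive_softmax(logits):
    # Hypothetical helper: exponentiate directly, which overflows for large logits.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stable_softmax(logits):
    # Hypothetical helper: shift by the max so every exponent is <= 0.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1000.0, 999.0, 998.0]  # made-up values, large enough to overflow float64

try:
    naive_softmax(logits)
except OverflowError:
    print("naive softmax overflowed")

# Roughly [0.665, 0.245, 0.090], the same as softmax([2.0, 1.0, 0.0]).
print(stable_softmax(logits))
```

In TensorFlow the same shift could be written with tf.reduce_max, tf.exp and tf.reduce_sum along the softmax axis. Whether that keeps the memory benefit of the manual version mentioned above is something you would need to check on your own inputs.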