For every input sentence yielded by the data loader, print a line with comma-separated floats corresponding to the ambiguity of each word.
Ambiguity can be obtained from the output of the softmax for each word:
If the softmax output for a word is (V = vocab size)
[p(word_1), ..., p(word_V)]
then the ambiguity of that word is the entropy of this distribution:
-p(word_1) log p(word_1) - ... - p(word_V) log p(word_V)
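
The computation above can be sketched in Python as follows. This is only an illustration under stated assumptions: the softmax output for each word is taken to be a plain list of probabilities, the names `ambiguity` and `format_sentence` and the sample values are hypothetical, and the natural logarithm is assumed because the text does not fix a base.

```python
import math


def ambiguity(probs):
    """Entropy of one softmax distribution (natural log is an assumption)."""
    # Skip zero probabilities: p * log(p) tends to 0 as p -> 0.
    return sum(-p * math.log(p) for p in probs if p > 0.0)


def format_sentence(softmax_rows):
    """Build one comma-separated line with the ambiguity of each word."""
    return ",".join(f"{ambiguity(row):.4f}" for row in softmax_rows)


# Hypothetical softmax outputs for a two-word sentence with V = 4.
sentence = [
    [0.25, 0.25, 0.25, 0.25],  # uniform distribution: ambiguity is log(4)
    [1.0, 0.0, 0.0, 0.0],      # one-hot distribution: ambiguity is 0
]

print(format_sentence(sentence))  # prints 1.3863,0.0000
```

In an actual run, the rows would come from the model's softmax for each sentence yielded by the data loader, with one printed line per sentence.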