danielwatson6 / hate-speech-project


Add `ambiguity` runnable method in model #2

Closed: danielwatson6 closed this issue 4 years ago

danielwatson6 commented 4 years ago

For every input sentence yielded by the data loader, print a line with comma-separated floats corresponding to the ambiguity of each word.

Ambiguity can be obtained from the output of the softmax for each word:

If the softmax output looks like this (V = vocab size):
[p(word_1), ..., p(word_V)]

then the ambiguity is the entropy of that distribution:
-p(word_1) log p(word_1) - ... - p(word_V) log p(word_V)
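
A minimal sketch of the computation described above, in plain Python. The names `ambiguity` and `format_sentence`, the four-decimal formatting, the natural-log base, and the example probabilities are all assumptions for illustration; the actual model and data loader in this repo are not shown here. Zero probabilities are skipped, following the usual convention that 0 log 0 = 0.

```python
import math


def ambiguity(probs):
    """Entropy (natural log) of one softmax distribution over the vocabulary."""
    # Skip p == 0 terms: by convention 0 * log 0 contributes nothing.
    return sum(-p * math.log(p) for p in probs if p > 0.0)


def format_sentence(softmax_rows):
    """Build one comma-separated line with one ambiguity value per word."""
    return ",".join(f"{ambiguity(row):.4f}" for row in softmax_rows)


# Hypothetical softmax outputs for a 3-word sentence with V = 4.
sentence = [
    [1.0, 0.0, 0.0, 0.0],      # fully confident prediction
    [0.25, 0.25, 0.25, 0.25],  # uniform, so entropy equals log(4)
    [0.5, 0.5, 0.0, 0.0],      # two equally likely words, entropy equals log(2)
]
print(format_sentence(sentence))
```

In the real method, each row would presumably come from the model's softmax for one word position, and one such line would be printed per sentence yielded by the data loader.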