mauryaland closed this 3 months ago
@mauryaland thanks for fixing this!
To reproduce the problem that is fixed:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

tagger: SequenceTagger = SequenceTagger.load("ner-fast")

sentence_1 = Sentence("Mr John Smith arrived")
sentence_2 = Sentence("Hey ho ho ho")

tagger.predict(
    [sentence_1, sentence_2],
    force_token_predictions=True,
    return_probabilities_for_all_classes=True,
)

print(sentence_1[1])
print(sentence_1[1].get_tags_proba_dist("ner"))
print()
print(sentence_1[2])
print(sentence_1[2].get_tags_proba_dist("ner"))
```
The tag probability distribution reports a different probability for the predicted tag than the prediction itself. The cause was an indexing bug: when predicting a batch, every token was assigned the tag-sequence distributions of the *last* sentence decoded by the Viterbi decoder (here `sentence_2`), rather than the distributions belonging to its own sentence. This is fixed in the PR.
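The shape of the bug can be sketched in isolation. The snippet below is a minimal, hypothetical reconstruction (the function names and toy distributions are mine, not flair's actual code): a loop variable left over from decoding the batch makes every sentence receive the last sentence's per-token distributions, while the fix pairs each sentence with its own decoded sequence.

```python
# Two toy "sentences" and pretend Viterbi output: one list of per-token
# tag probability distributions per sentence (hypothetical values).
batch = ["Mr John Smith arrived".split(), "Hey ho ho ho".split()]
all_tags = [
    [{"O": 0.6}, {"B-PER": 0.9}, {"I-PER": 0.8}, {"O": 0.7}],
    [{"O": 0.99}, {"O": 0.98}, {"O": 0.97}, {"O": 0.96}],
]

def attach_buggy(batch, all_tags):
    # Bug pattern: after the loop, `tags` still holds the distributions
    # of the final sentence, and that stale value is reused for everyone.
    for tags in all_tags:
        pass
    return [tags for _ in batch]

def attach_fixed(batch, all_tags):
    # Fix: zip each sentence with its own decoded tag sequence.
    return [tags for _, tags in zip(batch, all_tags)]

buggy = attach_buggy(batch, all_tags)
fixed = attach_fixed(batch, all_tags)
print(buggy[0][1])  # wrong: distribution taken from the last sentence
print(fixed[0][1])  # right: distribution for the token "John"
```

With the buggy version, the distribution printed for "John" comes from `sentence_2`, which is exactly the mismatch the reproduction above exposes via `get_tags_proba_dist("ner")`.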