flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

fix _all_scores_for_token in ViterbiDecoder #3455

Closed mauryaland closed 3 months ago

mauryaland commented 4 months ago

There was a bug where, for each sentence in a batch, the Viterbi decoder used the last tag sequence of the batch instead of the tag sequence corresponding to that sentence.
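A minimal, hypothetical sketch of the bug class (not flair's actual code): the per-token scores should be paired with each sentence's own tag sequence, but the buggy version always picked the last one in the batch.

```python
def buggy_all_scores(batch_tag_sequences):
    # BUG: every sentence is paired with the LAST sentence's tag sequence
    return [batch_tag_sequences[-1] for _ in batch_tag_sequences]

def fixed_all_scores(batch_tag_sequences):
    # FIX: pair each sentence with its own tag sequence
    return [seq for seq in batch_tag_sequences]

batch = [["B-PER", "I-PER"], ["O", "O"]]
print(buggy_all_scores(batch))  # both entries are ["O", "O"]
print(fixed_all_scores(batch))  # each sentence keeps its own sequence
```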

alanakbik commented 3 months ago

@mauryaland thanks for fixing this!

alanakbik commented 3 months ago

To reproduce the problem that is fixed:

from flair.data import Sentence
from flair.models import SequenceTagger

# load a pretrained NER tagger
tagger: SequenceTagger = SequenceTagger.load("ner-fast")

sentence_1 = Sentence("Mr John Smith arrived")
sentence_2 = Sentence("Hey ho ho ho")

# predict on a batch of two sentences, requesting per-token
# probability distributions over all classes
tagger.predict(
    [sentence_1, sentence_2],
    force_token_predictions=True,
    return_probabilities_for_all_classes=True,
)

# compare each token's predicted tag with its probability distribution
print(sentence_1[1])
print(sentence_1[1].get_tags_proba_dist("ner"))

print()

print(sentence_1[2])
print(sentence_1[2].get_tags_proba_dist("ner"))

The tag probability distribution reports a different probability for the predicted tag than the prediction itself. That is because it was built from the last tag sequence in the batch (sentence_2's) rather than from the sentence's own sequence. This is fixed in the PR.
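After the fix, a simple invariant should hold: the probability the distribution assigns to the predicted tag matches the prediction's own score, and the predicted tag is the argmax of the distribution. A hypothetical check, using plain dicts to stand in for flair's label objects:

```python
def check_consistency(predicted_tag, predicted_score, proba_dist):
    # the distribution's probability for the predicted tag must match
    assert abs(proba_dist[predicted_tag] - predicted_score) < 1e-6
    # and the predicted tag should be the argmax of the distribution
    assert max(proba_dist, key=proba_dist.get) == predicted_tag

# before the fix, this check failed for every sentence in a batch
# except the last one
check_consistency("B-PER", 0.98, {"B-PER": 0.98, "O": 0.02})
```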