writer / fitbert

Use BERT to Fill in the Blanks
https://pypi.org/project/fitbert/
Apache License 2.0

How to handle likelihood when comparing across different token lengths? #3

Closed sam-writer closed 5 years ago

sam-writer commented 5 years ago

The way this algorithm works is by asking BERT for the likelihood of a token, given the surrounding tokens. But if you have this situation:

from fitbert import FitBert

fb = FitBert()

masked_string = "Why Bert, you're looking ***mask*** today!"
options = ['buff', 'handsome', 'quite strong']

ranked_options = fb.rank(masked_string, options=options)

Then we have an issue: the first two options are one token each, but the third is two. That means we will get a likelihood score for buff (call it L(buff)), another for handsome (L(handsome)), but two for quite strong (L(quite), L(strong)).

While we can directly compare L(buff) to L(handsome), what do we do with L(quite) and L(strong)?
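To make the setup concrete, here is a minimal sketch of one plausible way those per-token likelihoods could be obtained: tokenize the option, insert one [MASK] per sub-token, and read off the masked LM's probability for each sub-token. This uses the HuggingFace transformers API rather than FitBert's actual internals, and the function name option_token_likelihoods is illustrative only.

# Hedged sketch, not FitBert's real implementation.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def option_token_likelihoods(masked_string, option, mask_marker="***mask***"):
    # Replace the marker with one [MASK] per sub-token of the option.
    option_ids = tokenizer.encode(option, add_special_tokens=False)
    masks = " ".join([tokenizer.mask_token] * len(option_ids))
    text = masked_string.replace(mask_marker, masks)

    inputs = tokenizer(text, return_tensors="pt")
    mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

    with torch.no_grad():
        logits = model(**inputs).logits[0]
    probs = torch.softmax(logits, dim=-1)

    # One likelihood per sub-token of the option, e.g. L(quite), L(strong).
    return [probs[pos, tok_id].item() for pos, tok_id in zip(mask_positions, option_ids)]

So 'buff' and 'handsome' each come back as a single number, while 'quite strong' comes back as a list of two, and we need some way to collapse that list before ranking.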

In the current ranking code, we take the max of the per-token likelihoods to make scores comparable across different token lengths. This may be wrong.

For this ticket, try to come up with good test cases to see how max performs. Then, if max doesn't seem to work well, try other aggregation methods (some candidates are sketched below)... if there is a theoretical basis for the choice, all the better.
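For reference, here is a hedged sketch of a few candidate aggregation functions one might test against max. The geometric mean (equivalently, the exponentiated average log-likelihood) is the usual length-normalisation trick for comparing sequences of different lengths, e.g. in beam-search scoring, so it is probably the candidate with the clearest theoretical basis. The function names are illustrative, not part of fitbert's API.

import math

def agg_max(likelihoods):
    # Current behaviour: score the option by its single most likely sub-token.
    return max(likelihoods)

def agg_mean(likelihoods):
    # Arithmetic mean: dilutes, but does not strongly punish, one unlikely sub-token.
    return sum(likelihoods) / len(likelihoods)

def agg_geometric_mean(likelihoods):
    # Geometric mean == exp(average log-likelihood): length-normalised, so
    # one-token and two-token options are scored on the same scale.
    return math.exp(sum(math.log(p) for p in likelihoods) / len(likelihoods))

# Toy example: a one-token option vs. a two-token option with one weak sub-token.
print(agg_max([0.30]), agg_max([0.60, 0.01]))                        # max favours the two-token option
print(agg_geometric_mean([0.30]), agg_geometric_mean([0.60, 0.01]))  # geometric mean does not

The toy numbers above illustrate the suspected failure mode of max: a multi-token option gets credit for its single best sub-token even when another sub-token is very unlikely, whereas the length-normalised score penalises that.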