mapmeld closed this issue 3 years ago
position=2 is the correct parameter. position=0 returns the same result because, to access the hidden state, we subtract one from the position, so position=0 looks up element -1. With Python's negative indexing, that points to the last item in the array, which is the same item that position=2 selects.
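Here is a minimal sketch of that wraparound. It is not the library's actual code: the list `hidden_states` and the helper `lookup` are hypothetical stand-ins, and the list is assumed to hold exactly two entries, which is what makes position=2 the last item.

```python
# Hypothetical stand-in for the per-position hidden states.
# Assumption: exactly two entries, so index 1 is also the last item.
hidden_states = ["state at index 0", "state at index 1"]


def lookup(position):
    """Mirror the behaviour described above: subtract one, then index."""
    return hidden_states[position - 1]


# position=2 -> index 1 (the last item)
# position=0 -> index -1, which Python also resolves to the last item
print(lookup(2) == lookup(0))
# position=1 -> index 0, a different entry
print(lookup(1) == lookup(0))
```

Under these assumptions the first comparison is true and the second is false, which matches what the issue reports: position=0 and position=2 agree, while position=1 differs.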
I'll have it throw an error if position=0 is entered. That's because position 0 is always an input token. The model did not generate anything in that position. So there are no probabilities associated with that position.
Does that make sense?
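The proposed check might look roughly like the sketch below. The function name `checked_lookup`, the error type, and the message are all assumptions for illustration; the actual fix may be written differently.

```python
def checked_lookup(hidden_states, position):
    """Return the hidden state for a position, rejecting position=0.

    Hypothetical helper: position 0 is an input token, so there are no
    generated probabilities to read there.
    """
    if position == 0:
        raise ValueError(
            "position=0 is an input token; no probabilities are available"
        )
    # Same off-by-one lookup as before, now safe from the -1 wraparound.
    return hidden_states[position - 1]
```

With the guard in place, position=0 fails loudly instead of silently returning the same result as the last position.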
Yes, your explanation and the proposed error make sense as a way to avoid this problem.
Great!
In the gpt2 model, I am measuring the distribution of calendar dates.
I assumed that to read predictions for the next token, I would need either position=0 or position=2, depending on whether it referred to the 0th token of the full string or of the generated output. I was surprised to see these return the same tokens and probabilities.
If I query position=1, then I see 'the' and other tokens which might follow "On " in the original sentence.