Closed joyce9936 closed 9 months ago
GPT-2 has a fixed context limit of 1024 tokens, so if you pass in a text containing 5000 sentences in one go, it will fail: position indices past 1024 fall outside the model's position-embedding table, which is what raises the IndexError. I recommend you read the file line by line and then pass the sentences in batches. You can find an example here
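In case it helps, here is a minimal sketch of the line-by-line, batched approach described above. It assumes the file has one sentence per line. The `score_batch` argument is a hypothetical placeholder, not part of any library: swap in whatever call your surprisal model exposes for scoring a list of sentences. The helper names and the batch size of 32 are also just illustrative choices.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def read_sentences(path: str) -> Iterator[str]:
    """Yield non-empty lines from a text file (assumes one sentence per line)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group an iterable into lists of at most `batch_size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def score_in_batches(
    sentences: Iterable[str],
    score_batch: Callable[[List[str]], list],  # hypothetical: your model call
    batch_size: int = 32,
) -> list:
    """Call `score_batch` on small groups of sentences and collect the results."""
    results = []
    for batch in batched(sentences, batch_size):
        results.extend(score_batch(batch))
    return results
```

You would then call something like `score_in_batches(read_sentences("input.txt"), my_scorer)`, where `input.txt` and `my_scorer` are placeholders for your own file and scoring function. Batching keeps each call small, but every individual sentence still has to fit within the 1024-token limit on its own.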
I am trying to calculate surprisal values by feeding in a .txt file with about 5000 sentences, but I encounter this error message: IndexError: index out of range in self. Can anyone help with this issue?
Here is the code:
Here is the error message:
Expected behavior: I would like to get the surprisal value for each word in the whole text file.
Thank you!