Hi, I am getting an error while generating InferSent embeddings. The error is as follows, with the full traceback at the end of this email:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x86 in position 11: invalid start byte
The error occurs after I run infer_sent_embs.build_vocab(x_train, tokenize=True).
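The message itself can be reproduced in isolation, which may help narrow things down: byte 0x86 is a UTF-8 continuation byte, so it can never begin a character. Below is a minimal sketch of that check. The helper name `first_bad_utf8_offset` and the sample bytes are hypothetical and not part of InferSent, and I am only assuming (not confirming) that some file read inside build_vocab is being decoded as UTF-8 text.

```python
from typing import Optional


def first_bad_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8 decoding, or None.

    Hypothetical helper for illustration; not part of InferSent.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the offending byte in `data`.
        return err.start
    return None


# Eleven ASCII bytes followed by 0x86 mirror the position in the error above.
sample = b"hello world" + b"\x86"
print(first_bad_utf8_offset(sample))  # prints 11
```

Running the same check on the first few kilobytes of whichever file build_vocab reads would show whether that file is plain UTF-8 text. I have not confirmed which file triggers the error in my case.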
Note that I ran your code in Google Colab. Also, the InferSent links in the Python file infersent.py have expired and need to be updated.
The new links are:
INFERSENT_GLOVE_MODEL_URL = 'https://dl.fbaipublicfiles.com/infersent/infersent1.pkl'
INFERSENT_FASTTEXT_MODEL_URL = 'https://dl.fbaipublicfiles.com/infersent/infersent2.pkl'
`
UnicodeDecodeError Traceback (most recent call last)