What encoding do the pre-trained word2vec models use? When I try to load a pre-trained model file, I get the error below, and I have not been able to troubleshoot it.
(using Portuguese as an example here)
import gensim

model = gensim.models.KeyedVectors.load_word2vec_format(
    'pt/pt.bin',
    binary=True,
)
Error message:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 0: invalid start byte
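One check that might help narrow this down: as far as I understand, a file in the original word2vec binary format starts with a plain ASCII header line holding the vocabulary size and the vector dimension, separated by a space. The sketch below only tests whether the first line of a file has that shape. The helper name `looks_like_word2vec_binary` is made up for illustration and is not part of gensim.

```python
import re


def looks_like_word2vec_binary(path):
    """Return True if the file's first line looks like '<vocab_size> <vector_size>'.

    Hypothetical helper (not a gensim function). It inspects only the header
    line and does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        # Read at most 64 bytes, stopping early at the first newline.
        header = f.readline(64)
    # Two runs of ASCII digits separated by one space, then a line ending.
    return re.fullmatch(rb"\d+ \d+\r?\n", header) is not None
```

Calling `looks_like_word2vec_binary('pt/pt.bin')` and getting `False` would suggest the file was saved in some other format, so the problem may not be the text encoding at all. That is a guess on my part, not a confirmed diagnosis.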