facebookresearch / InferSent

InferSent sentence embeddings

model.build_vocab(sentences, tokenize=True) is throwing error #91

Closed kartikpandey2 closed 6 years ago

kartikpandey2 commented 6 years ago

On executing model.build_vocab(sentences, tokenize=True), the following error is thrown:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 962: character maps to <undefined>

Traceback (partial):

\models.py", line 139, in build_vocab
    self.word_vec = self.get_w2v(word_dict)
\models.py", line 110, in get_w2v
    for line in f:

vakul-singh commented 5 years ago

Can anyone help me out? I am getting the same error.