BUPTLdy / Sentiment-Analysis

Chinese Shopping Reviews sentiment analysis
http://buptldy.github.io/2016/07/20/2016-07-20-sentiment%20analysis/

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte #17

Open shawnwang95 opened 6 years ago

shawnwang95 commented 6 years ago

Prefix dict has been built succesfully.
Traceback (most recent call last):
  File "predict.py", line 23, in <module>
    lstm_predict(sentence)
  File "code/Sentiment_lstm.py", line 187, in lstm_predict
    data=input_transform(string)
  File "code/Sentiment_lstm.py", line 173, in input_transform
    model=gensim.models.Word2Vec.load_word2vec_format('lstm_data/Word2vec_model.pkl', binary = True, unicode_errors='ignore')
  File "/anaconda3/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1172, in load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 217, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

I also tried the call without binary = True, "model=gensim.models.Word2Vec.load_word2vec_format('lstm_data/Word2vec_model.pkl', unicode_errors='ignore')", and still got the same error.
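
One check that might help narrow this down: a pickle stream written with protocol 2 or later begins with the byte 0x80, which is the same byte the error reports at position 0. The sketch below only inspects the first byte of a file. The helper names (looks_like_pickle, demo) are made up for illustration and are not part of this repo or of gensim, and the demo uses temporary files rather than the real model file.

```python
import os
import pickle
import tempfile

# Pickle protocol 2 and later start every stream with the PROTO opcode, 0x80.
PICKLE_PROTO_OPCODE = b"\x80"


def looks_like_pickle(path):
    """Return True if the first byte of the file is the pickle PROTO opcode."""
    with open(path, "rb") as f:
        return f.read(1) == PICKLE_PROTO_OPCODE


def demo():
    """Write a pickle and a word2vec-style header to temp files and check both."""
    with tempfile.TemporaryDirectory() as tmp:
        pkl_path = os.path.join(tmp, "example.pkl")
        txt_path = os.path.join(tmp, "example.txt")
        with open(pkl_path, "wb") as f:
            pickle.dump({"vocab": ["好", "差"]}, f, protocol=2)
        with open(txt_path, "wb") as f:
            f.write(b"2 3\n")  # word2vec-format header: vocab size, vector size
        return looks_like_pickle(pkl_path), looks_like_pickle(txt_path)


if __name__ == "__main__":
    # Hypothetical usage against the path from the traceback:
    # print(looks_like_pickle("lstm_data/Word2vec_model.pkl"))
    print(demo())
```

If this returns True for lstm_data/Word2vec_model.pkl, the file is probably a pickle rather than a word2vec-format file, in which case gensim.models.Word2Vec.load('lstm_data/Word2vec_model.pkl') may be worth trying in place of load_word2vec_format. That is a guess and has not been confirmed here.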