Open parap1uie-s opened 4 years ago
If the model still loads and works as expected in Python 2.7, you might be able to modify the .vocab
dict there, to use true unicode strings as keys, then re-save the model for better results in Python 3.x.
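A minimal sketch of that key conversion, assuming the problem keys are UTF-8 encoded byte strings (the thread does not confirm the encoding). `to_unicode_keys` is a hypothetical helper, not part of gensim, and it works on any plain dict:

```python
def to_unicode_keys(vocab, encoding="utf-8"):
    """Return a new dict in which byte-string keys are decoded to unicode.

    `encoding` is an assumption; use whatever encoding the training
    corpus was actually read with.
    """
    fixed = {}
    for key, value in vocab.items():
        # In Python 2.7, `bytes` is an alias for `str`, so this branch
        # catches the non-unicode keys there as well.
        if isinstance(key, bytes):
            key = key.decode(encoding)
        fixed[key] = value
    return fixed


# Example with a stand-in dict instead of a real model's .vocab
sample = {b"\xe4\xb8\xad": 1, u"a": 2}
converted = to_unicode_keys(sample)
```

Under Python 2.7 the result would be assigned back to the model's .vocab dict before re-saving. Where exactly that dict lives depends on the gensim version, so that step is left out here.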
Alternatively, a full recipe might help generate other ideas for patching the model: create a new tiny model in Python 2.7 with at least one problem word, show that it works under Py2.7, save the model, load it in Py3.x, and show the problem there. (Without seeing representative code for creating your word2vec.emb
or the file itself, it's not clear what might have happened to cause the problem.)
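One possible patch on the Python 3 side, offered only as a guess: if the broken keys turn out to be UTF-8 bytes that were decoded as Latin-1 while unpickling (nothing in this thread confirms that cause), each key can be round-tripped back. `repair_mojibake` is a hypothetical helper, not a gensim function:

```python
def repair_mojibake(key):
    """Undo a UTF-8-read-as-Latin-1 mix-up; return the key unchanged otherwise."""
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return key.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Either the key has characters outside Latin-1 (already proper
        # unicode) or the bytes are not valid UTF-8: leave it alone.
        return key


# Simulate a key broken in the assumed way, then repair it.
broken = b"\xe4\xb8\xad".decode("latin-1")
repaired = repair_mojibake(broken)
```

Comparing a few repaired keys against words known to be in the training data would show whether this assumption holds before touching the whole vocabulary.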
Problem description
A gensim model was trained under Python 2.7 with a Chinese dataset.
However, we are now using Python 3.6, and we get some broken strings in .vocab.keys(), as described in the title.
Are there any steps to convert a model trained under Python 2.7 so that it is compatible with Python 3.6?
Thanks in advance.
Steps/code/corpus to reproduce
Versions