Yoctol / seq2vec

Transform sequence of words into a fixed-length representation vector
GNU General Public License v3.0

can't load word2vec model when running example code #33

Open estathop opened 6 years ago

estathop commented 6 years ago

I am trying to run the "LSTM to LSTM auto-encoder with word embedding (RNN to RNN architecture)" example. I have already trained my own word2vec model with gensim and saved it with model.save('/home/estathop/Documents/word2vecmodel/w2v1model'). When I then run this part of the example:

# load Gensim word2vec from word2vec_model_path
word2vec = GensimWord2vec('/home/estathop/Documents/word2vecmodel/w2v1model')

the following error occurs:

Traceback (most recent call last):
  File "", line 5, in
    word2vec = GensimWord2vec('/home/estathop/Documents/word2vecmodel/w2v1model')
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/word2vec/gensim_word2vec.py", line 9, in __init__
    model_path, binary=True
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/models/keyedvectors.py", line 1120, in load_word2vec_format
    limit=limit, datatype=datatype)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/models/utils_any2vec.py", line 174, in _load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/utils.py", line 359, in any2unicode
    return unicode(text, encoding, errors=errors)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte

Any ideas how to fix or bypass this?

bhavikm commented 6 years ago

Try saving the word2vec model in the original binary format:

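from gensim.models import Word2Vec
model = Word2Vec.load('/home/estathop/Documents/word2vecmodel/w2v1model')  # the model written by model.save()
model.wv.save_word2vec_format('w2v-original.bin', binary=True)  # word2vec binary format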
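
For context, the 0x80 at position 0 in the traceback is consistent with a pickle file: pickle protocol 2 and later start with that byte, and gensim's model.save() writes a pickle. load_word2vec_format, which the traceback shows GensimWord2vec calling with binary=True, instead expects a text header line of the form "vocab_size vector_size". Below is a minimal, stdlib-only sketch for checking which kind of file you have before passing it in. The helper name looks_like_word2vec_format and the toy files are made up for illustration; they are not part of seq2vec or gensim.

```python
import os
import pickle
import struct
import tempfile


def looks_like_word2vec_format(path):
    """Hypothetical helper: True if the file begins with a 'vocab_size vector_size' header."""
    with open(path, "rb") as f:
        first_line = f.readline(64)  # the header is short; 64 bytes is plenty
    try:
        parts = first_line.decode("utf-8").split()
    except UnicodeDecodeError:
        # e.g. a pickle, whose first byte 0x80 is not valid UTF-8
        return False
    return len(parts) == 2 and all(p.isdigit() for p in parts)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a file written by model.save(): any pickle with protocol >= 2.
    native_path = os.path.join(tmp, "native.model")
    with open(native_path, "wb") as f:
        pickle.dump({"toy": "model"}, f, protocol=2)

    # Stand-in for a word2vec binary file: text header, then "word " + float32 values.
    w2v_path = os.path.join(tmp, "toy.bin")
    with open(w2v_path, "wb") as f:
        f.write(b"1 3\n")
        f.write(b"hello " + struct.pack("<3f", 0.1, 0.2, 0.3) + b"\n")

    native_is_w2v = looks_like_word2vec_format(native_path)
    toy_is_w2v = looks_like_word2vec_format(w2v_path)

print(native_is_w2v, toy_is_w2v)
```

If the check fails for your file, it was most likely written with model.save() and needs to be re-exported with save_word2vec_format as shown above.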

estathop commented 6 years ago

@bhavikm thanks, that got me past the problem above. Now another error occurs when I execute the next block from the example:

transformer = Seq2VecR2RWord(
      word2vec_model=word2vec,
      max_length=20,
      latent_size=300,
      encoding_size=300,
      learning_rate=0.05
)

Traceback (most recent call last):
  File "", line 6, in
    learning_rate=0.05
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/model/seq2vec_R2R_word.py", line 55, in __init__
    learning_rate=learning_rate
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/model/seq2vec_base.py", line 68, in __init__
    self.model, self.encoder = self.create_model()
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/model/seq2vec_R2R_word.py", line 87, in create_model
    dense_dropout=0.
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/yklz/recurrent/rnn_cell.py", line 22, in __init__
    **kwargs
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/engine/topology.py", line 262, in __init__
    self.stateful = False
AttributeError: can't set attribute

SoluMilken commented 6 years ago

Please use Python 3.5 or above.
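
If you would rather have the script fail early with a clear message than hit an AttributeError deep inside Keras, one option is a version guard at the top of the script. This is only a minimal sketch: the function name python_is_supported is made up here, and the (3, 5) minimum is taken from the comment above.

```python
import sys


def python_is_supported(version_info=None, minimum=(3, 5)):
    """Hypothetical helper: True if the interpreter is at least `minimum` (major, minor)."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Stop before importing seq2vec if the interpreter is too old.
if not python_is_supported():
    raise RuntimeError(
        "Python 3.5 or above is required, found %d.%d" % tuple(sys.version_info[:2])
    )
```

The paths in the tracebacks above (lib/python2.7) show the environment in question was running Python 2.7, which this guard would reject.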