Open liwzhi opened 6 years ago
Of course, the vectors should be loaded with the proper codec; it seems the model was trained in a different coding environment. Can you check that?
I have come across the same error, anybody help? Thank you ~
I came across the same error as well. I changed:

```python
word_vectors = KeyedVectors.load_word2vec_format(path, binary=True)
```

into

```python
word_vectors = KeyedVectors.load(path)
```

It turns out that `load_word2vec_format` is used when we're trying to load word vectors that were trained with the original C implementation of word2vec. Since these pre-trained word vectors were trained with Python (gensim), we can use `load` instead.
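The distinction above can be sketched with a small stdlib-only heuristic (not a gensim API; the classification logic is illustrative): gensim's `.save()` writes a pickle stream, the original C tool writes an ASCII header line, and many downloads are gzip-compressed.

```python
def guess_loader(path):
    """Heuristic sketch (not part of gensim): peek at the first bytes
    of a vectors file to decide which gensim loader is likely to work."""
    with open(path, 'rb') as f:
        head = f.read(2)
    if head == b'\x1f\x8b':
        # gzip magic bytes: a compressed download such as a .bin.gz;
        # KeyedVectors.load_word2vec_format can usually read these directly.
        return 'load_word2vec_format'
    if head[:1] == b'\x80':
        # pickle protocol marker: the file was written by gensim's .save(),
        # so KeyedVectors.load / Word2Vec.load is the matching reader.
        return 'load'
    # Otherwise assume the original C word2vec format, whose first line
    # is an ASCII "vocab_size dimensions" header.
    return 'load_word2vec_format'
```

For example, a file produced by `model.save(...)` would come back as `'load'`, while the C-format `GoogleNews` binary would come back as `'load_word2vec_format'`.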
@galuhsahid Thank you so much, it works now. : )
I have tried to read the files as you pointed out, but I got the next error:

```
  File "C:\ProgramData\Anaconda2\lib\site-packages\gensim\models\base_any2vec.py", line 380, in syn1neg
    self.trainables.syn1neg = value
AttributeError: 'Word2Vec' object has no attribute 'trainables'
```

:(
Same error as @anavaldi. Any solution?
I solved this error by running the .sh script on my own word embeddings.
I have come across the same error. I changed `gensim.models.KeyedVectors.load_word2vec_format()` into `gensim.models.Word2Vec.load()`. Then it worked.
@hinamu it works, Thanks
@anavaldi

> I solved this error by running the .sh script on my own word embeddings.

What do you mean?
> I have tried to read the files as you pointed out, but I got the next error:
>
> ```
>   File "C:\ProgramData\Anaconda2\lib\site-packages\gensim\models\base_any2vec.py", line 380, in syn1neg
>     self.trainables.syn1neg = value
> AttributeError: 'Word2Vec' object has no attribute 'trainables'
> ```

I solved this issue by downgrading my gensim version from 3.6 to 3.0.
UnpicklingError Traceback (most recent call last)
@kusumlata123 even I am getting that UnpicklingError.
I am also getting the unpickling error... Any ideas? My code is:
```python
chinese_model = gensim.models.Word2Vec.load(os.path.join(desktop, 'cc.zh.300.bin.gz'))
```
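One likely cause here (an inference from the filename, not something stated in the thread): `cc.zh.300.bin.gz` is a gzip-compressed fastText binary model, not a gensim pickle, so `Word2Vec.load` has nothing it can unpickle. A sketch, with illustrative file names, is to decompress first and then use gensim's Facebook-fastText loader:

```python
import gzip
import shutil

def decompress_gz(src, dst):
    """Decompress a .gz download to a plain file on disk."""
    # cc.zh.300.bin.gz is gzip-compressed; Word2Vec.load expects a
    # gensim pickle, which is why it raises an UnpicklingError here.
    with gzip.open(src, 'rb') as fin, open(dst, 'wb') as fout:
        shutil.copyfileobj(fin, fout)
    return dst

# After decompressing, gensim's fastText loader (available in newer
# gensim versions) reads Facebook's .bin files:
#   from gensim.models.fasttext import load_facebook_vectors
#   wv = load_facebook_vectors(decompress_gz('cc.zh.300.bin.gz', 'cc.zh.300.bin'))
```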
I also tried saving the text file and loading it via the function provided on the fastText official site. I first changed the file extension from `.gz` to `.txt` and used the following function:
```python
import io

def load_vectors(fname):
    fin = io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore')
    # First line is the "vocab_size dimensions" header.
    n, d = map(int, fin.readline().split())
    data = {}
    for line in fin:
        tokens = line.rstrip().split(' ')
        # Store a list rather than a lazy map object so the values
        # survive after the file is closed.
        data[tokens[0]] = [float(t) for t in tokens[1:]]
    return data

model = load_vectors(os.path.join(desktop, 'cc.zh.300.vec.txt'))
```
However, I got the following errors:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-d67f52bde947> in <module>
----> 1 model = load_vectors(os.path.join(desktop, 'cc.zh.300.vec.txt'))

<ipython-input-3-0f69b5ce62b8> in load_vectors(fname)
      1 def load_vectors(fname):
      2     fin = io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore')
----> 3     n, d = map(int, fin.readline().split())
      4     data = {}
      5     for line in fin:

ValueError: invalid literal for int() with base 10: '\x08\x08p[\x00\x03cc.zh.300.vec\x00\\ͮfMr7?W3ۀ0|Szдl\x14I\x132'
```
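The garbage in that ValueError is a clue: `cc.zh.300.vec` appears embedded in binary bytes because a gzip member header can carry the original filename. Renaming `.gz` to `.txt` does not decompress the file. A sketch of a loader that checks the magic bytes and opens through `gzip` when needed (function names are illustrative, not from fastText or gensim):

```python
import gzip
import io

def open_maybe_gzip(path):
    # A gzip stream always starts with the bytes 0x1f 0x8b, and its
    # header may embed the original filename (hence 'cc.zh.300.vec'
    # showing up inside the ValueError above).
    with open(path, 'rb') as f:
        magic = f.read(2)
    if magic == b'\x1f\x8b':
        return gzip.open(path, 'rt', encoding='utf-8', errors='ignore')
    return io.open(path, 'r', encoding='utf-8', newline='\n', errors='ignore')

def read_header(path):
    # First line of a .vec file is "vocab_size dimensions".
    with open_maybe_gzip(path) as fin:
        n, d = map(int, fin.readline().split())
    return n, d
```

With this, the same file parses whether or not it has been decompressed first.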
I tried the above solution but I am getting this error: `UnpicklingError: invalid load key, '\x1f'`. My code:

```python
from gensim import models

word2vec_path = 'GoogleNews-vectors-negative300.bin.gz.2'
word2vec = models.KeyedVectors.load(word2vec_path)
```
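The key `'\x1f'` in that message is the first gzip magic byte: `.load()` unpickles the file, but the GoogleNews download is a gzipped C-format binary, so the matching call would be `models.KeyedVectors.load_word2vec_format(word2vec_path, binary=True)`, which reads `.gz` files directly. A minimal stdlib-only reproduction of the error (the byte string is a stand-in for a real gzip stream):

```python
import io
import pickle

# KeyedVectors.load() ultimately unpickles the file; the first byte of
# a gzip stream (0x1f) is not a valid pickle opcode, which reproduces
# the reported message.
try:
    pickle.load(io.BytesIO(b'\x1f\x8b\x08\x00fake-gzip-payload'))
except pickle.UnpicklingError as exc:
    print(exc)
```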
> I came across the same error as well. I changed:
>
> `word_vectors = KeyedVectors.load_word2vec_format(path, binary=True)`
>
> into
>
> `word_vectors = KeyedVectors.load(path)`
>
> It turns out that `load_word2vec_format` is used when we're trying to load word vectors that were trained with the original C implementation of word2vec. Since these pre-trained word vectors were trained with Python (gensim), we can use `load` instead.

When I tried this, I am getting: `UnpicklingError: unpickling stack underflow`
For Korean language, I got this error:

```
AttributeError: Can't get attribute 'Vocab' on <module 'gensim.models.word2vec' from 'C:\Users\ductr\Python\lib\site-packages\gensim\models\word2vec.py'>
```

Would you mind letting me know what the error is?
> I tried the above solution but I am getting this error: `UnpicklingError: invalid load key, '\x1f'`. My code:
>
> ```python
> from gensim import models
>
> word2vec_path = 'GoogleNews-vectors-negative300.bin.gz.2'
> word2vec = models.KeyedVectors.load(word2vec_path)
> ```
I get the same error after using:
```python
from gensim.models import Word2Vec
from gensim.models.keyedvectors import KeyedVectors

model = Word2Vec.load(model_path)
```
What am I doing wrong?
Hi,
I am trying to load a Chinese pretrained word2vec model:

```python
word_vectors = KeyedVectors.load_word2vec_format(path, binary=True)  # C binary format
```

and it throws this error.