coosto / dutch-word-embeddings

Dutch word embeddings, trained on a large collection of Dutch social media messages and news/blog/forum posts.

Unpickling error when loading model #2

Closed aitorme closed 3 years ago

aitorme commented 3 years ago

I have a newbie problem here. I tried to load the model using gensim.models.Word2Vec.load("dutch-word-embeddings/model.bin") and got the following error:


UnpicklingError                           Traceback (most recent call last)
/var/folders/nt/9bk8jw9n28b1p_jlynbvy4lc0000gp/T/ipykernel_46367/2953225409.py in <module>
----> 1 model = gensim.models.Word2Vec.load("dutch-word-embeddings/model.bin")

~/Documents/dutch_venv/lib/python3.9/site-packages/gensim/models/word2vec.py in load(cls, rethrow, *args, **kwargs)
   1928         """
   1929         try:
-> 1930             model = super(Word2Vec, cls).load(*args, **kwargs)
   1931             if not isinstance(model, Word2Vec):
   1932                 rethrow = True

~/Documents/dutch_venv/lib/python3.9/site-packages/gensim/utils.py in load(cls, fname, mmap)
    483         compress, subname = SaveLoad._adapt_by_suffix(fname)
    484
--> 485         obj = unpickle(fname)
    486         obj._load_specials(fname, mmap, compress, subname)
    487         obj.add_lifecycle_event("loaded", fname=fname)

~/Documents/dutch_venv/lib/python3.9/site-packages/gensim/utils.py in unpickle(fname)
   1458     """
   1459     with open(fname, 'rb') as f:
-> 1460         return _pickle.load(f, encoding='latin1')  # needed because loading from S3 doesn't support readline()
   1461
   1462

UnpicklingError: unpickling stack underflow


I'm not sure if the problem comes from the model itself or from the function I'm using to load it (it's my first time using it).

Thanks in advance!

aitorme commented 3 years ago

I realized I was using the wrong method to load the model. I should be using gensim.models.KeyedVectors.load_word2vec_format("dutch-word-embeddings/model.bin", binary=True) instead.
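
For anyone who hits the same error, here is a minimal sketch of how the two loaders could be told apart. It assumes model.bin is stored in the word2vec binary format, which begins with an ASCII header line of the form "<vocab_size> <vector_size>", whereas Word2Vec.load expects a pickle written by gensim's own save(). The helper names looks_like_word2vec_binary and load_embeddings are hypothetical and not part of gensim; only KeyedVectors.load_word2vec_format and Word2Vec.load are real gensim calls.

```python
import re

# Assumption: a word2vec-format file starts with "<vocab_size> <vector_size>\n" in ASCII.
_W2V_HEADER = re.compile(rb"^\d+ \d+\n")


def looks_like_word2vec_binary(path):
    """Return True if the file starts with a word2vec-style header line."""
    with open(path, "rb") as f:
        head = f.read(64)
    return _W2V_HEADER.match(head) is not None


def load_embeddings(path):
    """Pick a gensim loader based on the file header (hypothetical helper)."""
    # Imported lazily so the header check above works without gensim installed.
    from gensim.models import KeyedVectors, Word2Vec

    if looks_like_word2vec_binary(path):
        # Plain vectors in word2vec format: this returns a KeyedVectors object.
        return KeyedVectors.load_word2vec_format(path, binary=True)
    # Otherwise assume a full model pickled by gensim's own save().
    return Word2Vec.load(path)
```

With gensim installed, load_embeddings("dutch-word-embeddings/model.bin") should then take the load_word2vec_format branch. Note that the result is a KeyedVectors object (vectors only), not a trainable Word2Vec model.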