shashank-bhatt-07 / Natural-Language-Generation-using-LSTM-Keras

Natural Language Generation using LSTM-Keras

Word2Vec Google archive #1

Open cjsweeney opened 6 years ago

cjsweeney commented 6 years ago

Hi, the link to the Google archive for the word2vec .bin file is no longer valid. How would one get around this?

shashank-bhatt-07 commented 6 years ago

Hi,

Google's pre-trained word2vec model can be downloaded from this link:

https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing

You have to rename the downloaded model file to GoogleNews-vectors-negative300.bin.

Kcrong commented 5 years ago

@shashankbhatt After I downloaded GoogleNews-vectors-negative300.bin from the link above, loading it raised an error.

I tried to run

model = gensim.models.Word2Vec.load('GoogleNews-vectors-negative300.bin')

And I got

---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-13-9320157d22ce> in <module>
----> 1 model = gensim.models.Word2Vec.load('GoogleNews-vectors-negative300.bin')

~/env3/lib/python3.7/site-packages/gensim/models/word2vec.py in load(cls, *args, **kwargs)
   1310         """
   1311         try:
-> 1312             model = super(Word2Vec, cls).load(*args, **kwargs)
   1313 
   1314             # for backward compatibility for `max_final_vocab` feature

~/env3/lib/python3.7/site-packages/gensim/models/base_any2vec.py in load(cls, *args, **kwargs)
   1242 
   1243         """
-> 1244         model = super(BaseWordEmbeddingsModel, cls).load(*args, **kwargs)
   1245         if not hasattr(model, 'ns_exponent'):
   1246             model.ns_exponent = 0.75

~/env3/lib/python3.7/site-packages/gensim/models/base_any2vec.py in load(cls, fname_or_handle, **kwargs)
    601 
    602         """
--> 603         return super(BaseAny2VecModel, cls).load(fname_or_handle, **kwargs)
    604 
    605     def save(self, fname_or_handle, **kwargs):

~/env3/lib/python3.7/site-packages/gensim/utils.py in load(cls, fname, mmap)
    420         compress, subname = SaveLoad._adapt_by_suffix(fname)
    421 
--> 422         obj = unpickle(fname)
    423         obj._load_specials(fname, mmap, compress, subname)
    424         logger.info("loaded %s", fname)

~/env3/lib/python3.7/site-packages/gensim/utils.py in unpickle(fname)
   1359         # Because of loading from S3 load can't be used (missing readline in smart_open)
   1360         if sys.version_info > (3, 0):
-> 1361             return _pickle.load(f, encoding='latin1')
   1362         else:
   1363             return _pickle.loads(f.read())

UnpicklingError: invalid load key, '\x1f'.
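
A note on the error above: `'\x1f'` is the first byte of the gzip magic number (`0x1f 0x8b`), which suggests the downloaded file may still be gzip-compressed even after being renamed to `.bin`. Separately, `Word2Vec.load` expects a model pickled by gensim's own `save()`, whereas Google's file uses the word2vec C binary format, which gensim reads through `KeyedVectors.load_word2vec_format(..., binary=True)`. The following is a minimal sketch under those assumptions, not a confirmed fix for this repository; the helper names `is_gzipped`, `gunzip`, and `load_google_vectors` are hypothetical.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def gunzip(src, dst):
    """Decompress the gzip file `src` into a new file `dst`."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def load_google_vectors(path):
    """Load vectors stored in the word2vec C binary format (needs gensim)."""
    # Imported lazily so the helpers above work without gensim installed.
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(path, binary=True)
```

One possible way to use it: if `is_gzipped('GoogleNews-vectors-negative300.bin')` returns `True`, decompress with `gunzip` to a new file name of your choice and pass that path to `load_google_vectors`. The returned object holds word vectors only, so it can be queried but not trained further like a full `Word2Vec` model.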