piskvorky / gensim-data

Data repository for pretrained NLP models and NLP corpora.
https://rare-technologies.com/new-api-for-pretrained-nlp-models-and-datasets-in-gensim/
GNU Lesser General Public License v2.1

module 'word2vec-google-news-300' has no attribute 'load_data' #37

Closed jarviszhb closed 5 years ago

jarviszhb commented 5 years ago

I tried to use gensim.downloader to download 'word2vec-google-news-300', but my network isn't very reliable, so I downloaded word2vec-google-news-300.gz and __init__.py from GitHub and put them into ~/gensim-data/word2vec-google-news-300/. But when I use api.load("word2vec-google-news-300") to load this model, I received an error like this:

AttributeError: module 'word2vec-google-news-300' has no attribute 'load_data'

import gensim.downloader as api
model = api.load("word2vec-google-news-300")

piskvorky commented 5 years ago

Post the full stack trace please.

Plus your versions of Python, gensim and OS.

aniruddhakal commented 4 years ago

What was the resolution? I'm facing the same issue.

OS: Ubuntu 19.10
gensim version: 3.8.1

Stack Trace:

AttributeError                            Traceback (most recent call last)
<ipython-input-4-f5399884feeb> in <module>
      1 import gensim.downloader as api
      2 
----> 3 api.load('glove-twitter-100')

/usr/local/lib/python3.7/dist-packages/gensim/downloader.py in load(name, return_path)
    500         sys.path.insert(0, BASE_DIR)
    501         module = __import__(name)
--> 502         return module.load_data()
    503 
    504 

AttributeError: module 'glove-twitter-100' has no attribute 'load_data'
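
One possible explanation for this traceback, offered as an assumption rather than something confirmed in this thread: on Python 3, a directory on `sys.path` that has no usable `__init__.py` can still be imported as a namespace package, and such a module has no `load_data` attribute. The sketch below reproduces that with a throwaway directory. `fake-model` is a made-up name and no gensim code is involved.

```python
import os
import sys
import tempfile

# Hypothetical stand-in for ~/gensim-data; nothing here touches gensim itself.
base_dir = tempfile.mkdtemp()

# A dataset directory that contains the archive but no __init__.py.
os.mkdir(os.path.join(base_dir, "fake-model"))
open(os.path.join(base_dir, "fake-model", "fake-model.gz"), "wb").close()

sys.path.insert(0, base_dir)
try:
    # Same import mechanism that appears in the traceback above.
    module = __import__("fake-model")
finally:
    sys.path.remove(base_dir)

# Python 3 imports the bare directory as a namespace package,
# so the import succeeds but the resulting module is empty.
has_loader = hasattr(module, "load_data")
print(has_loader)
```

If this is what happens on the affected machine, calling `module.load_data()` would fail with the same `AttributeError`, which would point at a missing, misnamed, or unreadable `__init__.py` rather than at the archive itself.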

piskvorky commented 4 years ago

I cannot replicate, on the same Ubuntu 19.10 and gensim 3.8.1.

Does your ~/gensim-data directory look like this?

(tst) [radim@h3:~/gensim-data]$ ls -l
total 28
drwx------ 2 radim radim  4096 Mar 16 10:32 glove-twitter-100
-rw-r--r-- 1 radim radim 21626 Mar 16 10:34 information.json

(tst) [radim@h3:~/gensim-data]$ ls -l glove-twitter-100/
total 396432
-rw-r--r-- 1 radim radim       256 Mar 16 10:32 __init__.py
-rw-r--r-- 1 radim radim       604 Mar 16 10:32 __init__.pyc
-rw-r--r-- 1 radim radim 405932991 Mar 16 10:32 glove-twitter-100.gz

aniruddhakal commented 4 years ago

Hi @piskvorky

I fixed the issue: I had to widen the access permissions on the .gz archive. Thanks for the clue.

[Screenshot: gensim_fix]

piskvorky commented 4 years ago

That's weird, that shouldn't be necessary. My snippet also had just -rw-r--r--. I guess you're loading the file in Python under a different user, not aniruddha, right?

Either way, we should probably fail with a more useful error message if opening the file fails. The current error is too cryptic.
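
A minimal sketch of what such a check might look like. `call_load_data` is a hypothetical helper, not part of gensim. It assumes the dataset module has already been imported the way the traceback shows, and it only replaces the bare `AttributeError` with a message that points at the dataset directory.

```python
import os
import types


def call_load_data(module, name, base_dir):
    """Call module.load_data(), or fail with a message that names the likely culprit.

    Hypothetical helper for illustration; not an existing gensim function.
    """
    loader = getattr(module, "load_data", None)
    if loader is None:
        data_dir = os.path.join(base_dir, name)
        raise RuntimeError(
            "Dataset module %r has no load_data(); check that %s contains a "
            "readable __init__.py and %s.gz" % (name, data_dir, name)
        )
    return loader()


# Usage with a stand-in module that lacks load_data.
empty = types.ModuleType("glove-twitter-100")
try:
    call_load_data(empty, "glove-twitter-100", os.path.expanduser("~/gensim-data"))
except RuntimeError as err:
    message = str(err)
```

The message deliberately names both files, since the thread has not yet established which of them is the real cause.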

Can you open a PR for that @aniruddhakal? Thanks.

aniruddhakal commented 4 years ago

@piskvorky No, I'm running the Python program under the same user account.

Are you suggesting that I make the changes and raise a PR? I'll be happy to do that.

piskvorky commented 4 years ago

Well, step one is finding out what's wrong. The fact that changing permissions to -rwxrwxrwx helped is a strong clue, but I still don't know why that's needed on your system.

Can you look deeper into the actual error, and check what exactly happens when you leave the permissions at -rw-r--r--?
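
One way to gather that information, sketched under the assumption that the files live in `~/gensim-data/<name>/` as in the listing earlier in the thread. `describe_dataset_files` is a hypothetical helper. It uses only the standard library and reports, for each expected file, its mode string and whether the current user can actually read it.

```python
import os
import stat


def describe_dataset_files(base_dir, name):
    """Return {filename: (mode string or None, readable?)} for the expected files."""
    report = {}
    for filename in ("__init__.py", name + ".gz"):
        path = os.path.join(base_dir, name, filename)
        try:
            mode = stat.filemode(os.stat(path).st_mode)  # e.g. '-rw-r--r--'
        except OSError:
            report[filename] = (None, False)  # missing or not reachable
            continue
        try:
            # Actually open the file instead of trusting the mode bits alone.
            with open(path, "rb") as handle:
                handle.read(1)
            readable = True
        except OSError:
            readable = False
        report[filename] = (mode, readable)
    return report


if __name__ == "__main__":
    base = os.path.expanduser("~/gensim-data")
    results = describe_dataset_files(base, "glove-twitter-100")
    for filename, (mode, readable) in results.items():
        print(filename, mode, "readable" if readable else "NOT readable")
```

Running this once with the original permissions and once after the change would show whether the read itself fails or whether something else differs between the two states.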