Closed parvathysarat closed 7 years ago
Hey! I think the first problem is quite simple: since we've only recently added more models to spaCy v2.0 alpha, the shortcut es doesn't yet point to the correct model. It's trying to load the es_core_web_md model, whereas the new Spanish model is a sm model. (This will definitely be fixed for the stable release, when we know which models are going to be available.)
In the meantime, you can simply download the model explicitly, e.g.:
spacy download es_core_web_sm
You can find the exact download command in the alpha models directory, in the right sidebar next to each model listing.
About the second error: Hmm, this looks like the server timed out, so this might indeed have something to do with your connection. Are you able to download the model archive file directly via your browser using the link: https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0a7/en_core_web_sm-2.0.0a7.tar.gz The model is only around 35 MB. Once you've downloaded the file, you can just install the model from the local path, for example:
pip install /path/to/downloads/en_core_web_sm-2.0.0a7.tar.gz
# link model to the shortcut 'en' – normally, this is performed by "spacy download",
# but since you're installing it manually, you need to do this yourself if you want to
# load the model as "en"
spacy link en_core_web_sm en
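If you want to script the manual route, the release link above appears to follow a name-version pattern. Here is a minimal sketch, assuming that pattern also holds for other models and versions. The helper names model_archive_url and pip_install_command are made up for illustration and are not part of spaCy.

```python
import sys

# Base of the release links quoted in this thread.
RELEASES = "https://github.com/explosion/spacy-models/releases/download"


def model_archive_url(name, version):
    """Build the archive URL, assuming the <name>-<version> pattern of the link above."""
    tag = "{}-{}".format(name, version)
    return "{}/{}/{}.tar.gz".format(RELEASES, tag, tag)


def pip_install_command(archive):
    """Return the pip command (as an argument list) for a local path or a URL."""
    # "python -m pip" ties the install to the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", archive]


if __name__ == "__main__":
    print(model_archive_url("en_core_web_sm", "2.0.0a7"))
    print(" ".join(pip_install_command("/path/to/downloads/en_core_web_sm-2.0.0a7.tar.gz")))
```

The argument list can be passed to subprocess.check_call once the archive has been downloaded. Creating the shortcut link is still a separate step, as in the commands above.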
The python -m spacy.en.download all command has been deprecated since v1.7.0 (and only points to the old models for spaCy v1.6.0 anyway), so this definitely won't work. (It should say this in the error message though, so there might be a bug here. Will check and fix this if necessary.)
Hey, thanks a ton for replying. I was able to install the English model using
$ python -m spacy download en
which, after downloading, gave me the message: You can now load the model via spacy.load('en')
Using IPython,
import spacy
nlp=spacy.load('en')
[AttributeError Traceback (most recent call last)
<ipython-input-5-a32b6d2b36d8> in <module>()
----> 1 nlp=spacy.load('en')
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\__init__.pyc in load(name, **overrides)
13 from .deprecated import resolve_load_name
14 name = resolve_load_name(name, **overrides)
---> 15 return util.load_model(name, **overrides)
16
17
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\util.pyc in load_model(name, **overrides)
102 if isinstance(name, basestring_):
103 if name in set([d.name for d in data_path.iterdir()]): # in data dir / shortcut
--> 104 return load_model_from_link(name, **overrides)
105 if is_package(name): # installed as package
106 return load_model_from_package(name, **overrides)
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\util.pyc in load_model_from_link(name, **overrides)
121 "Cant' load '%s'. If you're using a shortcut link, make sure it "
122 "points to a valid model package (not just a data directory)." % name)
--> 123 return cls.load(**overrides)
124
125
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\data\en\__init__.pyc in load(**overrides)
10
11 def load(**overrides):
---> 12 return load_model_from_init_py(__file__, **overrides)
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\util.pyc in load_model_from_init_py(init_file, **overrides)
165 if not model_path.exists():
166 raise ValueError("Can't find model directory: %s" % path2str(data_path))
--> 167 return load_model_from_path(data_path, meta, **overrides)
168
169
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\util.pyc in load_model_from_path(model_path, meta, **overrides)
148 component = nlp.create_pipe(name, config=config)
149 nlp.add_pipe(component, name=name)
--> 150 return nlp.from_disk(model_path)
151
152
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\language.pyc in from_disk(self, path, disable)
571 if not (path / 'vocab').exists():
572 exclude['vocab'] = True
--> 573 util.from_disk(path, deserializers, exclude)
574 return self
575
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\util.pyc in from_disk(path, readers, exclude)
495 for key, reader in readers.items():
496 if key not in exclude:
--> 497 reader(path / key)
498 return path
499
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\language.pyc in <lambda>(p)
558 path = util.ensure_path(path)
559 deserializers = OrderedDict((
--> 560 ('vocab', lambda p: self.vocab.from_disk(p)),
561 ('tokenizer', lambda p: self.tokenizer.from_disk(p, vocab=False)),
562 ('meta.json', lambda p: p.open('w').write(json_dumps(self.meta)))
vocab.pyx in spacy.vocab.Vocab.from_disk()
vectors.pyx in spacy.vectors.Vectors.from_disk()
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\spacy\util.pyc in from_disk(path, readers, exclude)
495 for key, reader in readers.items():
496 if key not in exclude:
--> 497 reader(path / key)
498 return path
499
vectors.pyx in spacy.vectors.Vectors.from_disk.load_keys()
C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\numpy\lib\npyio.pyc in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
389 _ZIP_PREFIX = asbytes('PK\x03\x04')
390 N = len(format.MAGIC_PREFIX)
--> 391 magic = fid.read(N)
392 fid.seek(-N, 1) # back-up
393 if magic.startswith(_ZIP_PREFIX):
AttributeError: 'WindowsPath' object has no attribute 'read']
I have the en and es models downloaded in my working directory. What does this error message mean? Thanks again!
Ah, sorry you're having so many problems!
I have the en and es models downloaded in my working directory, what does this error message mean?
Interesting, this looks like an issue during deserialization, i.e. when the binary data of the model is loaded. Down the line in numpy, it seems to be unable to load from a WindowsPath... I'm surprised this hasn't come up before!
Just had a look at how this is handled in the numpy source, and it doesn't detect the WindowsPath as a Path and thus doesn't open the file. So in spaCy, I think we should be able to prevent this problem by always passing in a string for path / key to the reader. (Sorry for the dump of random specifics, just writing this down for the bugfix. I've already made the change on develop and it will be included in the next alpha release.)
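To make the idea concrete, here is a rough sketch of it, not the actual spaCy patch: a from_disk-style helper that converts path / key to a plain string before handing it to each reader. The function shape mirrors the traceback above, but the readers used in the demo are placeholders.

```python
from collections import OrderedDict
from pathlib import Path


def from_disk(path, readers, exclude):
    """Call each reader with a plain string path instead of a Path object."""
    path = Path(path)
    for key, reader in readers.items():
        if key not in exclude:
            # str() means the reader never has to recognise the Path type itself.
            reader(str(path / key))
    return path


if __name__ == "__main__":
    seen = []  # placeholder readers just record what they were given
    readers = OrderedDict((("vocab", seen.append), ("tokenizer", seen.append)))
    from_disk("model_dir", readers, exclude={"tokenizer"})
    print(seen)
```

Any reader that ends up calling something like numpy.load then receives a string file name, which it opens itself rather than treating the argument as an already open file object.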
There's not really an easy solution or workaround for you at the moment, except for switching to Python 3.6, which shouldn't have this problem.
Okay, thanks! Uninstalling and installing all of it again now. By the way, do you think not properly installing Microsoft Visual C++ 14.0 could have been the issue? I initially got an error asking me to install it, which I did, and then I was able to install spacy. But I think I may have installed a leaner/improper version of it the first time. Would that cause the error 'WindowsPath' object has no attribute 'read'?
In this case, I'd say it's unlikely – but then again, with Windows compilers, you never really know. Unfortunately, this stuff is still pretty tricky, and probably the number one source of issues for Windows users. (So it's good to keep this in mind in case you end up having more problems later on.)
Providing spaCy on conda has made a big difference, though – so once spaCy v2.0.0 stable is released, you'll also be able to download and install it straight from there.
Guess I'll have to see what I can do until it's out for download. After switching to Python 3.6, I was able to install spacy using the Visual C++ command prompt. But now I'm back to an error when downloading the English model: command 'cl.exe' failed: No such file or directory
I'm sure there are others who have been successful doing this? I've put too much energy and hope into spacy, so I need to solve this somehow.
I just did a quick search for that error and found this thread on StackOverflow: https://stackoverflow.com/questions/41724445/python-pip-on-windows-command-cl-exe-failed
It has some solutions and an accepted answer, so maybe this is helpful? The problem seems common enough, so there are also several other threads on (likely) the same issue.
I tried and retried the solutions suggested for cl.exe issue, but other than new/old errors cropping up I couldn't progress. Hence I've switched to Ubuntu! I could import spacy (Python 2.7) until I downloaded and installed (the way it was mentioned above by you) the English model. Now the error seems to be
import spacy
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/parvathy/.local/lib/python2.7/site-packages/spacy/__init__.py", line 10, in <module>
from . import en, de, zh, es, it, hu, fr, pt, nl, sv, fi, bn, he, nb, ja
File "/home/parvathy/.local/lib/python2.7/site-packages/spacy/en/__init__.py", line 4, in <module>
from ..language import Language
File "/home/parvathy/.local/lib/python2.7/site-packages/spacy/language.py", line 14, in <module>
from .pipeline import DependencyParser, EntityRecognizer
File "spacy/pipeline.pyx", line 1, in init spacy.pipeline (spacy/pipeline.cpp:16536)
File ".env/lib/python2.7/site-packages/thinc/extra/search.pxd", line 72, in init spacy.syntax.beam_parser (spacy/syntax/beam_parser.cpp:20037)
ValueError: thinc.extra.search.MaxViolation has the wrong size, try recompiling
I tried sudo pip install thinc==6.8.1 followed by installing the model again, but the error persists. Any thoughts? Thanks in advance, as always.
The correct Thinc version for spaCy nightly 2.0.0a17 is definitely 6.9.0. So if this is the combination you have and you've installed everything from scratch in a clean environment on Ubuntu, and used the latest version of the model, this should all be fine. Sorry it still isn't working. After all the stress so far, you definitely deserve better!
To help us debug, could you post the result of spacy info --markdown? And just to be safe, when you run spacy validate on the command line, does it show all models as green and up to date?
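A quick way to confirm which spaCy and Thinc versions actually ended up in the active environment is sketched below. It assumes Python 3.8+ (for importlib.metadata) and is not a substitute for spacy info or spacy validate. The helper names are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report(packages=("spacy", "thinc")):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, found in report().items():
        print("{}: {}".format(name, found or "not installed"))
```

Running this with the same interpreter that later imports spacy matters, since a system-wide install and a virtual environment can hold different versions.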
It worked after I redid the whole thing in a virtual environment! Thanks a lot for the help :)
Yessssss! 🎉🙏
I have a related issue. I downloaded the 'en' model with
$ python3 -m spacy download en
which yields:
Linking successful /home/abc/miniconda3/lib/python3.6/site-packages/en_core_web_sm --> /home/abc/miniconda3/lib/python3.6/site-packages/spacy/data/en
You can now load the model via spacy.load('en')
However, when I use it with nlp=spacy.load('en'), I still get the error "OSError: Can't find model 'en'".
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I installed spacy using pip and have been trying to download the language models. However,
$ python -m spacy download es
yields an error. While trying to download the English model,
$ pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-1.2.0/en_core_web_sm-1.2.0.tar.gz
as well as
$ python -m spacy download en
yield errors:
Collecting https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0a7/en_core_web_sm-2.0.0a7.tar.gz
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0a7/en_core_web_sm-2.0.0a7.tar.gz (36.4MB)
Exception:
Traceback (most recent call last):
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\basecommand.py", line 215, in main
status = self.run(options, args)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\commands\install.py", line 324, in run
requirement_set.prepare_files(finder)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\req\req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\req\req_set.py", line 620, in _prepare_file
session=self.session, hashes=hashes)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\download.py", line 821, in unpack_url
hashes=hashes
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\download.py", line 659, in unpack_http_url
hashes)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\download.py", line 882, in _download_http_url
_download_url(resp, link, content_file, hashes)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\download.py", line 605, in _download_url
consume(downloaded_chunks)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\utils\__init__.py", line 852, in consume
deque(iterator, maxlen=0)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\download.py", line 571, in written_chunks
for chunk in chunks:
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\utils\ui.py", line 139, in iter
for x in it:
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\download.py", line 560, in resp_read
decode_content=False):
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\_vendor\requests\packages\urllib3\response.py", line 357, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\_vendor\requests\packages\urllib3\response.py", line 324, in read
flush_decoder = True
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\contextlib.py", line 35, in __exit__
self.gen.throw(type, value, traceback)
File "C:\Users\PARVATHY SARAT\Anaconda2\lib\site-packages\pip\_vendor\requests\packages\urllib3\response.py", line 246, in _error_catcher
raise ReadTimeoutError(self._pool, None, 'Read timed out.')
ReadTimeoutError: HTTPSConnectionPool(host='github-production-release-asset-2e65be.s3.amazonaws.com', port=443): Read timed out.
$ python -m spacy.en.download all
C:\Users\PARVATHY SARAT\Anaconda2\python.exe: cannot import name fix_glove_vectors_loading
What do these issues mean and how do I rectify them? I tried downloading both using office Wifi and home Wifi, so I'm not sure it's a connection error.
OS: Windows 10
Thanks a lot in advance.