randomgambit closed 8 months ago
Can you provide a reproducible example? (We do not have access to your model object.)
Hi @kevinushey! It's been a while! You can reproduce the error by downloading and unpacking the model from https://github.com/explosion/spacy-models/releases//tag/en_core_web_lg-2.0.0 . Then you just give the path as I did here:
nlp = spacy.load('/otherdrive/model/en_core_web_lg-2.0.0/en_core_web_lg/en_core_web_lg-2.0.0')
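Before loading from a path, it may help to confirm that the directory looks like unpacked model data. The sketch below is only a sanity check and does not call spaCy; it assumes (as in spaCy 2.x model packages) that the data directory contains a meta.json file, and the helper name describe_model_dir is hypothetical.

```python
import json
from pathlib import Path


def describe_model_dir(path):
    """Report whether `path` looks like an unpacked spaCy model data directory.

    Assumption: the directory holds a meta.json file, as spaCy 2.x model
    packages do. This helper is illustrative and not part of spaCy.
    """
    model_dir = Path(path)
    meta_file = model_dir / "meta.json"
    report = {
        "exists": model_dir.is_dir(),
        "has_meta": meta_file.is_file(),
        "meta": None,
    }
    if report["has_meta"]:
        meta = json.loads(meta_file.read_text(encoding="utf-8"))
        # Keep only the fields that identify the model and its spaCy version.
        report["meta"] = {
            key: meta.get(key)
            for key in ("lang", "name", "version", "spacy_version")
        }
    return report


if __name__ == "__main__":
    # Path taken from the comment above; adjust it to your own drive.
    path = "/otherdrive/model/en_core_web_lg-2.0.0/en_core_web_lg/en_core_web_lg-2.0.0"
    print(describe_model_dir(path))
```

Running the same file from Spyder and through reticulate should give identical reports if both sessions can see the drive.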
Thanks! This also reproduces the issue, it seems:
library(reticulate)
spacy <- import("spacy")
spacy$load('en')
EDIT: scratch that; that error went away after downloading the language file with the instructions at https://spacy.io/usage/models; that is, I ran:
python3 -m spacy download en
and then was able to load that.
@kevinushey be careful: a model must either be downloaded with conda or loaded with spacy.load(), as I did. See https://spacy.io/usage/models#download-manual
Yes, that is the issue here. I cannot use the spacy download because I am behind a firewall. Also, I plan to use this with sparklyr, so it makes total sense to load my model from a given network drive. This works in spyder but doesn't in R with reticulate. There has to be some rational explanation here...
I was able to successfully load that model as well, so my only guess is that this:
python3 -m spacy download en
or some variant of that needs to be run in your environment. FWIW I saw this output on install:
kevin@zordon:~/r/pkg/reticulate [feature/python-virtualenv-absolute-path]
$ python3 -m spacy download en
Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
100% |████████████████████████████████| 37.4MB 7.7MB/s
Installing collected packages: en-core-web-sm
Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0
Linking successful
/usr/local/lib/python3.7/site-packages/en_core_web_sm -->
/Users/kevin/Library/Python/3.7/lib/python/site-packages/spacy/data/en
You can now load the model via spacy.load('en')
Is it possible that spaCy isn't seeing the symlink it needs to load this module?
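One way to test the symlink theory is to inspect the shortcut link directly. This is a minimal sketch using only the standard library; the helper name inspect_link is hypothetical, and the example path is the link target printed in the install output above, which will differ on other machines.

```python
import os


def inspect_link(link_path):
    """Report whether `link_path` is a symlink and whether it resolves."""
    is_link = os.path.islink(link_path)
    return {
        "is_symlink": is_link,
        # realpath follows the link chain to the final location.
        "target": os.path.realpath(link_path) if is_link else None,
        # exists() follows symlinks, so a dangling link yields False here.
        "target_exists": os.path.exists(link_path),
    }


if __name__ == "__main__":
    # Location reported by "Linking successful" above; adjust as needed.
    link = "/Users/kevin/Library/Python/3.7/lib/python/site-packages/spacy/data/en"
    print(inspect_link(link))
```

If is_symlink is true but target_exists is false, the link is dangling from the point of view of the process running the check.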
I see, but I cannot use the spacy download, and neither can anybody else behind a firewall. So the only option is to load the model manually with spacy.load().
I am not quite sure I understand how the linking is done, but were you actually able to run something like
py_run_string("import spacy")
py_run_string("nlp = spacy.load('/otherdrive/model/en_core_web_lg-2.0.0/en_core_web_lg/en_core_web_lg-2.0.0')")
perhaps @ines has an idea? Thanks!!!
Yes, that code runs fine for me in my environment.
damn... What do you see when you run reticulate::py_config()? I see multiple versions of Python under "python versions found:". Could that be the issue?
ha... actually, running py_run_string("import pandas as pd") gives me a chilling message:
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found
So maybe this is part of a broader issue of how to make R interact well with Python? And why on earth is this looking at /usr/lib?? My python exe is in /mydrive/anaconda/bin/python. Maybe THAT is the issue?
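To compare what the spyder session and the reticulate session actually see, a small report like the one below can be run in both. It is a sketch, not a diagnosis: the helper name interpreter_report is hypothetical, and whether LD_LIBRARY_PATH plays any role in this particular error is an assumption, not something confirmed in this thread.

```python
import os
import sys


def interpreter_report(environ=None):
    """Summarise which Python is running and where shared libraries are searched."""
    environ = os.environ if environ is None else environ
    raw = environ.get("LD_LIBRARY_PATH", "")
    return {
        "executable": sys.executable,        # path of the running interpreter
        "prefix": sys.prefix,                # installation or environment root
        "version": sys.version.split()[0],   # e.g. "3.7.1"
        # Extra directories the Linux dynamic linker searches (may be empty).
        "ld_library_path": [p for p in raw.split(os.pathsep) if p],
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If the executable or prefix differs between the two sessions, the two are not running the same Python installation.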
@randomgambit Sorry, only saw this discussion now! From your posts in explosion/spaCy#2982, it definitely sounds like this is unrelated to the models, so I'd recommend leaving them out of the test cases here, since they just introduce unnecessary complexity. Here's a simpler example, which only imports from the regular spaCy module:
from spacy.lang.en import English
In your environment, this resulted in an ImportError of DependencyParser. DependencyParser is a Cython module and errors like this usually indicate that there's something wrong with your compiler. The other error you shared also confirms that suspicion:
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found
If you search for that error message, you'll find all kinds of threads on this problem with various solutions – but it looks like they all come down to upgrading libstdc++. I'm pretty confident that once you've resolved that problem, spaCy will also work as expected 🙂
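To see which GLIBCXX versions a given libstdc++.so.6 provides, the version strings can be read straight out of the file, much like running strings on it and filtering for GLIBCXX. The sketch below assumes the version tags appear as plain ASCII in the binary; the helper names are hypothetical.

```python
import re
from pathlib import Path

# Matches version tags such as GLIBCXX_3.4.21 (digits and dots only).
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(blob):
    """Return the sorted, de-duplicated GLIBCXX versions found in `blob` (bytes)."""
    found = {m.group(1).decode("ascii") for m in _GLIBCXX.finditer(blob)}
    # Sort numerically so that 3.4.9 comes before 3.4.20.
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def provides(blob, wanted):
    """True if the library bytes mention the exact version `wanted`."""
    return wanted in glibcxx_versions(blob)


if __name__ == "__main__":
    # Library path taken from the error message above.
    lib = Path("/usr/lib/x86_64-linux-gnu/libstdc++.so.6")
    if lib.is_file():
        data = lib.read_bytes()
        print(glibcxx_versions(data))
        print("has 3.4.21:", provides(data, "3.4.21"))
    else:
        print(f"{lib} not found on this machine")
```

Running the same check against any other libstdc++.so.6 on the machine shows whether a newer copy is available.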
Hello reticulate team. I am escalating this issue to you because I was unable to solve it otherwise. Please see https://github.com/explosion/spaCy/issues/2982
Here is the issue: I am perfectly able to use the well-known spaCy package in Python and, in particular, to load my custom model in spyder.
Now doing the exact same thing in R (using reticulate) triggers an error when spacy apparently cannot load the en model (even though I pointed toward a specific folder).
After discussing a bit with @ines ,
What do you think? Thanks!!