chrisjbryant / errant

ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences.
MIT License

spacy 1.9.0 #14

Closed borgr closed 4 years ago

borgr commented 4 years ago

Hi, installing with pip doesn't work, and neither does installing from file. The problem is the outdated spacy dependency: pip can't build wheels for it, and importing errant then throws errors. Manually updating spacy seems to solve it (I have yet to use errant in depth with this setup, so I might still find that it did not). Environment, if relevant: python 3.7.3, spacy 2.2.4, gcc 8.3.

borgr commented 4 years ago

On Ubuntu with python 3.6.9, spacy also failed. I did not try building from source and updating there; that might have worked too.

chrisjbryant commented 4 years ago

Heya,

What error messages did you get? I personally run it in ubuntu and python 3.6.9, so know that definitely works.

You might need to update python wheel and maybe get the dev dependencies for python too. If you get a `bdist_wheel` error, definitely try updating that first: #12.

chrisjbryant commented 4 years ago

Something else to try: `pip install errant --no-cache-dir`

borgr commented 4 years ago

pip install python-dev-tools - did not solve
sudo apt-get install python3-dev (on the ubuntu where sudo is available) - did not solve
no cache: tried before asking and again now - did not solve

It seems the problem on Ubuntu is different from the one on the other Linux system, the cs computer. On cs: the same error occurs when just pip installing spacy 1.9.0 (error: command 'x86_64-linux-gnu-gcc' failed with exit status 1), but installing a newer spacy is not a problem. Is the old dependency a must?

On Ubuntu, it manages to install, but something is weird in spacy: it can be imported but it is "empty". These are all supposedly spacy problems, but it seems newer spacy versions have already solved them.

Errors on Ubuntu:

(errant) leshem@leshem:~$ python3 -m spacy download en
/home/leshem/envs/errant/bin/python3: No module named spacy.__main__; 'spacy' is a package and cannot be directly executed
(errant) leshem@leshem:~$ python
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> dir(spacy)
['__doc__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
>>> spacy.load("en")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'spacy' has no attribute 'load'
>>> 
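
One possible reading of the session above (an assumption, not something confirmed in this thread): a module whose `dir()` shows `__path__` but no `__file__` on Python 3.6 is typical of a namespace package, i.e. a leftover `spacy` directory without an `__init__.py`, for example from a partially failed install. A minimal sketch for checking how a name resolves; `describe_module` is a hypothetical helper, not part of spacy or errant:

```python
import importlib.util


def describe_module(name):
    """Return a short diagnosis of how `name` resolves on the import path."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not installed"
    # Namespace packages have search locations but no real origin
    # (origin is "namespace" on Python 3.6 and None on 3.7+).
    if spec.submodule_search_locations is not None and spec.origin in (None, "namespace"):
        locations = ", ".join(spec.submodule_search_locations)
        return "namespace package (no __init__.py) at " + locations
    return "regular module at " + str(spec.origin)


if __name__ == "__main__":
    print(describe_module("spacy"))
```

If this reports a namespace package, removing the listed directory and reinstalling spacy inside the virtual environment would be the natural next thing to try.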

Full error on the server: pip install errant fails while building spacy and most of the others (thinc cy).

  building 'spacy.strings' extension
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/cs/snapless/oabend/borgr/envs/errant/include/python3.7m -I/tmp/pip-install-_iwppskt/spacy/include -I/usr/include/python3.7m -I/cs/snapless/oabend/borgr/envs/errant/include/python3.7m -c spacy/strings.cpp -o build/temp.linux-x86_64-3.7/spacy/strings.o -O3 -Wno-strict-prototypes -Wno-unused-function -fopenmp
  cc1plus: warning: command line option ‘-Wno-strict-prototypes’ is valid for C/ObjC but not for C++
  spacy/strings.cpp: In function ‘void __Pyx_ExceptionSwap(PyObject**, PyObject**, PyObject**)’:
  spacy/strings.cpp:7228:24: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
       tmp_type = tstate->exc_type;
                          ^~~~~~~~
                          curexc_type
  spacy/strings.cpp:7229:25: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
       tmp_value = tstate->exc_value;
                           ^~~~~~~~~
                           curexc_value
  spacy/strings.cpp:7230:22: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
       tmp_tb = tstate->exc_traceback;
                        ^~~~~~~~~~~~~
                        curexc_traceback
  spacy/strings.cpp:7231:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
       tstate->exc_type = *type;
               ^~~~~~~~
               curexc_type
  spacy/strings.cpp:7232:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
       tstate->exc_value = *value;
               ^~~~~~~~~
               curexc_value
  spacy/strings.cpp:7233:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
       tstate->exc_traceback = *tb;
               ^~~~~~~~~~~~~
               curexc_traceback
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for spacy

borgr commented 4 years ago

If it helps, our IT/system people think the x86 problem occurs because this spacy version can't work on py 3.7+. (Changing the python version backward, even with conda, is a bit problematic on the cs computer; we are still looking for a way to downgrade our python for this.) If that is indeed the problem, then with python 3.8 already in use, this is going to be a problem for others in the near future.
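
The compiler errors in the log are consistent with that diagnosis: the generated `strings.cpp` accesses `exc_type`, `exc_value` and `exc_traceback` directly on `PyThreadState`, and CPython 3.7 moved those fields out of that struct. A minimal sketch of a pre-install check follows. The 3.7 cutoff is an assumption taken from this thread, and `legacy_spacy_buildable` is a hypothetical helper name:

```python
import sys


def legacy_spacy_buildable(version_info=None):
    """Return True if the interpreter is older than Python 3.7.

    Assumption from this thread: the spacy 1.9.0 sources compile on
    Python 3.6 but fail on 3.7 and later.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < (3, 7)


if __name__ == "__main__":
    if legacy_spacy_buildable():
        print("Python < 3.7: building spacy 1.9.0 may work")
    else:
        print("Python >= 3.7: expect the spacy 1.9.0 build to fail")
```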

chrisjbryant commented 4 years ago

Yes - we tried installing on python 3.7 recently and ran into a similar problem: issue

Is the old dependency a must?

Sadly, yes and no. Errant does work with spacy 2, but is 4x slower and slightly less accurate. This is because spacy's neural models are a lot slower than the old linear models. I could do something about the accuracy loss, but can't do anything about the speed loss, so still prefer v1.9. I did ask the spacy authors about updating spacy with the linear models, but they said this would be something to include in spacy 3, so I'm not sure about a timescale: link

I've not been able to recreate your error on ubuntu, so I'm still not sure what's going on there. :/ Maybe I should just start upgrading to spacy 2 despite the speed loss...

borgr commented 4 years ago

Perhaps the faster errant can be kept available for ones who need it in a separate branch?

chrisjbryant commented 4 years ago

Yea, I will start looking into it.

In the meantime, a friend just managed to install errant on a clean install of Ubuntu 18.04 with Python 3.6.9 without any problems. He ran:

apt-get install python3-pip
apt-get install python3-venv
pip3 install -U wheel
python3 -m venv errant_env
source errant_env/bin/activate
pip3 install errant
python3 -m spacy download en

If that doesn't work, you can also try installing some of the packages listed in the answer by r-wheeler here.

chrisjbryant commented 4 years ago

I just updated ERRANT to work with spacy >=2.2 and Python 3.7. It's slower, as predicted, but I tweaked a couple of rules so performance should be about the same!
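
To confirm that an environment meets the new requirement, the installed spacy version can be compared against 2.2. This is a rough sketch: `version_tuple` and `meets_minimum` are hypothetical helpers, and the parsing only handles plain numeric versions such as `2.2.4`:

```python
import re


def version_tuple(version):
    """Parse up to three leading numeric components: '2.2.4' -> (2, 2, 4)."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def meets_minimum(installed, minimum=(2, 2)):
    """Return True if the installed version string is at least `minimum`."""
    return version_tuple(installed) >= minimum


if __name__ == "__main__":
    try:
        import spacy
    except ImportError:
        print("spacy is not installed")
    else:
        # getattr guards against an "empty" spacy module with no __version__.
        installed = getattr(spacy, "__version__", None)
        if installed is None:
            print("spacy imported but exposes no __version__")
        else:
            print(installed, "ok" if meets_minimum(installed) else "too old")
```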