gabrielStanovsky / unified-factuality

Code, data and models for the paper "Integrating Deep Linguistic Features in Factuality Prediction over Unified Datasets" (Stanovsky, Eckle-Kohler, Puzikov, Dagan and Gurevych, ACL 2017)
MIT License

docopt missing in conversion script #5

Closed gabrielStanovsky closed 7 years ago

gabrielStanovsky commented 7 years ago

@judithek testing on:

Ubuntu 14.04.5 LTS (Release: 14.04, Codename: trusty)

Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2

I get an import error when running convert_corpora.sh:

./scripts/convert_corpora.sh

Converting UW..

Traceback (most recent call last):

  File "convert_uw_to_conll.py", line 6, in <module>

    from docopt import docopt

ImportError: No module named docopt
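The traceback above means that the interpreter running the conversion scripts cannot import docopt. A minimal sketch for checking this before running the scripts is given below. The helper name missing_modules and the module list are illustrative and not part of the repository; the import names are assumed from the packages named in this thread (docopt, spacy, bs4, Unidecode, lxml).

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Assumed import names for the packages mentioned in this thread.
REQUIRED = ["docopt", "spacy", "bs4", "unidecode", "lxml"]

if __name__ == "__main__":
    # An empty list means every module could be imported.
    print(missing_modules(REQUIRED))
```

Any name printed by this check would still need to be installed for the same interpreter that runs convert_corpora.sh.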

gabrielStanovsky commented 7 years ago

Fixed this with recent commits. @judithek, can you try again, please?

judithek commented 7 years ago

the script must be executed with sudo -E as well, otherwise I get the following error:

copying build/lib.linux-x86_64-2.7/docopt.py -> /usr/local/lib/python2.7/dist-packages

error: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/docopt.py'
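Errno 13 here means the user running the script may not write to /usr/local/lib/python2.7/dist-packages. As a rough sketch (the function name can_install_into is made up for illustration), the following checks whether a directory is writable for the current user. If the install target is not writable, running with sudo as described above is one option; installing into a virtualenv or with pip's --user option are common alternatives.

```python
import os
import tempfile


def can_install_into(directory):
    """Return True if the current user may create files in `directory`."""
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # The first path is taken from the error message above; the temp
    # directory is only included for comparison.
    for target in ("/usr/local/lib/python2.7/dist-packages", tempfile.gettempdir()):
        print("%s writable: %s" % (target, can_install_into(target)))
```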

judithek commented 7 years ago

for a while, the script seems to run ok, but I get several warnings, e.g.:

from spacy: #warning "Using deprecated NumPy API, disable it by " \

Could not find .egg-info directory in install record for docopt (from -r ./scripts/corpus_requirements.txt (line 1))

Could not find .egg-info directory in install record for spacy==1.8.2 (from -r ./scripts/corpus_requirements.txt (line 2))

Could not find .egg-info directory in install record for bs4==0.0.1 (from -r ./scripts/corpus_requirements.txt (line 3))

gabrielStanovsky commented 7 years ago

Are these only warnings, or do they stop the script?

judithek commented 7 years ago

then the following error occurs; this might be due to my local setup (xslt-config: not found):

In file included from src/lxml/lxml.etree.c:515:0:

src/lxml/includes/etree_defs.h:14:31: fatal error: libxml/xmlversion.h: No such file or directory

 #include "libxml/xmlversion.h"

                               ^

compilation terminated.

Compile failed: command 'x86_64-linux-gnu-gcc' failed with exit status 1

creating tmp

cc -I/usr/include/libxml2 -c /tmp/xmlXPathInitKXwSf7.c -o tmp/xmlXPathInitKXwSf7.o

/tmp/xmlXPathInitKXwSf7.c:1:26: fatal error: libxml/xpath.h: No such file or directory

 #include "libxml/xpath.h"

                          ^

compilation terminated.

*********************************************************************************

Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?

*********************************************************************************

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

judithek commented 7 years ago

not sure about the status of the warnings yet - will try to fix my setup first

judithek commented 7 years ago

after installing libxml2-dev and libxslt-dev via

sudo apt-get install libxml2-dev libxslt-dev python-dev

I am struggling with this error now:

x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -Isrc/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w

x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -lxslt -lexslt -lxml2 -lrt -lz -lm -o build/lib.linux-x86_64-2.7/lxml/etree.so

/usr/bin/ld: cannot find -lz

collect2: error: ld returned 1 exit status

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

gabrielStanovsky commented 7 years ago

Is this the result of the libxml install (sudo apt-get ..), or of our script? Can you please post the full log?

judithek commented 7 years ago

it seems to be a result of our script (?), see the full output below:

sudo -E ./scripts/convert_corpora.sh

Requirement already satisfied (use --upgrade to upgrade): docopt in /usr/local/lib/python2.7/dist-packages (from -r ./scripts/corpus_requirements.txt (line 1))

Requirement already satisfied (use --upgrade to upgrade): spacy==1.8.2 in /usr/local/lib/python2.7/dist-packages (from -r ./scripts/corpus_requirements.txt (line 2))

Requirement already satisfied (use --upgrade to upgrade): bs4==0.0.1 in /usr/local/lib/python2.7/dist-packages (from -r ./scripts/corpus_requirements.txt (line 3))

Requirement already satisfied (use --upgrade to upgrade): Unidecode==0.4.20 in /usr/local/lib/python2.7/dist-packages (from -r ./scripts/corpus_requirements.txt (line 4))

Downloading/unpacking lxml==3.7.3 (from -r ./scripts/corpus_requirements.txt (line 5))

Downloading lxml-3.7.3.tar.gz (3.8MB): 3.8MB downloaded

Running setup.py (path:/tmp/pip_build_root/lxml/setup.py) egg_info for package lxml

Building lxml version 3.7.3.

Building without Cython.

Using build configuration of libxslt 1.1.28

warning: no previously-included files found matching '*.py'

Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7 in /usr/local/lib/python2.7/dist-packages (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading/unpacking murmurhash<0.27,>=0.26 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading murmurhash-0.26.4.tar.gz

Running setup.py (path:/tmp/pip_build_root/murmurhash/setup.py) egg_info for package murmurhash

Requirement already satisfied (use --upgrade to upgrade): cymem<1.32,>=1.30 in /usr/local/lib/python2.7/dist-packages (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading/unpacking preshed<2.0.0,>=1.0.0 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading preshed-1.0.0.tar.gz (89kB): 89kB downloaded

Running setup.py (path:/tmp/pip_build_root/preshed/setup.py) egg_info for package preshed

Downloading/unpacking thinc<6.6.0,>=6.5.0 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading thinc-6.5.2.tar.gz (926kB): 926kB downloaded

Running setup.py (path:/tmp/pip_build_root/thinc/setup.py) egg_info for package thinc

warning: no files found matching 'buildbot.json'

Downloading/unpacking plac<1.0.0,>=0.9.6 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading plac-0.9.6-py2.py3-none-any.whl

Requirement already satisfied (use --upgrade to upgrade): six in /usr/local/lib/python2.7/dist-packages (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading/unpacking pathlib (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading pathlib-1.0.1.tar.gz (49kB): 49kB downloaded

Running setup.py (path:/tmp/pip_build_root/pathlib/setup.py) egg_info for package pathlib

Downloading/unpacking ujson>=1.35 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading ujson-1.35.tar.gz (192kB): 192kB downloaded

Running setup.py (path:/tmp/pip_build_root/ujson/setup.py) egg_info for package ujson

Requirement already satisfied (use --upgrade to upgrade): dill<0.3,>=0.2 in /usr/local/lib/python2.7/dist-packages (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading/unpacking requests<3.0.0,>=2.13.0 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading requests-2.14.2-py2.py3-none-any.whl (560kB): 560kB downloaded

Downloading/unpacking regex==2017.4.5 (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading regex-2017.04.05.tar.gz (601kB): 601kB downloaded

Running setup.py (path:/tmp/pip_build_root/regex/setup.py) egg_info for package regex

/usr/local/lib/python2.7/dist-packages/setuptools/dist.py:364: UserWarning: Normalizing '2017.04.05' to '2017.4.5'

normalized_version,

Requirement already satisfied (use --upgrade to upgrade): ftfy<5.0.0,>=4.4.2 in /usr/local/lib/python2.7/dist-packages (from spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Requirement already satisfied (use --upgrade to upgrade): beautifulsoup4 in /usr/local/lib/python2.7/dist-packages (from bs4==0.0.1->-r ./scripts/corpus_requirements.txt (line 3))

Downloading/unpacking wrapt (from thinc<6.6.0,>=6.5.0->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading wrapt-1.10.10.tar.gz

Running setup.py (path:/tmp/pip_build_root/wrapt/setup.py) egg_info for package wrapt

Downloading/unpacking tqdm<5.0.0,>=4.10.0 (from thinc<6.6.0,>=6.5.0->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading tqdm-4.11.2-py2.py3-none-any.whl (46kB): 46kB downloaded

Requirement already satisfied (use --upgrade to upgrade): cytoolz<0.9,>=0.8 in /usr/local/lib/python2.7/dist-packages (from thinc<6.6.0,>=6.5.0->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading/unpacking termcolor (from thinc<6.6.0,>=6.5.0->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading termcolor-1.1.0.tar.gz

Running setup.py (path:/tmp/pip_build_root/termcolor/setup.py) egg_info for package termcolor

Requirement already satisfied (use --upgrade to upgrade): html5lib in /usr/local/lib/python2.7/dist-packages (from ftfy<5.0.0,>=4.4.2->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading/unpacking wcwidth (from ftfy<5.0.0,>=4.4.2->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading wcwidth-0.1.7-py2.py3-none-any.whl

Downloading/unpacking toolz>=0.8.0 (from cytoolz<0.9,>=0.8->thinc<6.6.0,>=6.5.0->spacy==1.8.2->-r ./scripts/corpus_requirements.txt (line 2))

Downloading toolz-0.8.2.tar.gz (45kB): 45kB downloaded

Running setup.py (path:/tmp/pip_build_root/toolz/setup.py) egg_info for package toolz

Installing collected packages: lxml, murmurhash, preshed, thinc, plac, pathlib, ujson, requests, regex, wrapt, tqdm, termcolor, wcwidth, toolz

Found existing installation: lxml 3.3.3

Not uninstalling lxml at /usr/lib/python2.7/dist-packages, owned by OS

Running setup.py install for lxml

Building lxml version 3.7.3.

Building without Cython.

Using build configuration of libxslt 1.1.28

building 'lxml.etree' extension

x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -Isrc/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w

x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -lxslt -lexslt -lxml2 -lrt -lz -lm -o build/lib.linux-x86_64-2.7/lxml/etree.so

/usr/bin/ld: cannot find -lz

collect2: error: ld returned 1 exit status

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Complete output from command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_root/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-2mP4AR-record/install-record.txt --single-version-externally-managed --compile:

Building lxml version 3.7.3.

Building without Cython.

Using build configuration of libxslt 1.1.28

running install

running build

running build_py

creating build

creating build/lib.linux-x86_64-2.7

creating build/lib.linux-x86_64-2.7/lxml

copying src/lxml/builder.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/doctestcompare.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/__init__.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/_elementpath.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/cssselect.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/pyclasslookup.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/ElementInclude.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/usedoctest.py -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/sax.py -> build/lib.linux-x86_64-2.7/lxml

creating build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/__init__.py -> build/lib.linux-x86_64-2.7/lxml/includes

creating build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/_diffcommand.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/diff.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/builder.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/formfill.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/__init__.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/html5parser.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/soupparser.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/ElementSoup.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/_setmixin.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/usedoctest.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/_html5builder.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/defs.py -> build/lib.linux-x86_64-2.7/lxml/html

copying src/lxml/html/clean.py -> build/lib.linux-x86_64-2.7/lxml/html

creating build/lib.linux-x86_64-2.7/lxml/isoschematron

copying src/lxml/isoschematron/__init__.py -> build/lib.linux-x86_64-2.7/lxml/isoschematron

copying src/lxml/lxml.etree.h -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/lxml.etree_api.h -> build/lib.linux-x86_64-2.7/lxml

copying src/lxml/includes/relaxng.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/config.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/uri.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xmlerror.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/dtdvalid.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xslt.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xmlparser.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/tree.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/htmlparser.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xpath.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/c14n.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xmlschema.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/xinclude.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/etreepublic.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/schematron.pxd -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/lxml-version.h -> build/lib.linux-x86_64-2.7/lxml/includes

copying src/lxml/includes/etree_defs.h -> build/lib.linux-x86_64-2.7/lxml/includes

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/rng

copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/rng

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl

copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl

copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl

creating build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.linux-x86_64-2.7/lxml/isoschematron/resources/xsl/iso-schematron-xslt1

running build_ext

building 'lxml.etree' extension

creating build/temp.linux-x86_64-2.7

creating build/temp.linux-x86_64-2.7/src

creating build/temp.linux-x86_64-2.7/src/lxml

x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/libxml2 -Isrc/lxml/includes -I/usr/include/python2.7 -c src/lxml/lxml.etree.c -o build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w

x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -lxslt -lexslt -lxml2 -lrt -lz -lm -o build/lib.linux-x86_64-2.7/lxml/etree.so

/usr/bin/ld: cannot find -lz

collect2: error: ld returned 1 exit status

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1


Can't roll back lxml; was not uninstalled

Cleaning up...

Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_root/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-2mP4AR-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip_build_root/lxml

judithek commented 7 years ago

found the solution on the internet: http://stackoverflow.com/questions/5178416/pip-install-lxml-error

Installing zlib1g-dev resolved the issue.

So to sum up: I fixed my local Ubuntu setup by running

sudo apt-get install libxml2-dev libxslt-dev python-dev

sudo apt-get install -y zlib1g-dev
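Each build failure in this thread pointed at a missing system package. As a rough summary, the sketch below maps the error fragments quoted above to the Ubuntu packages that were installed to get past them. The function name suggest_packages and the table are illustrative only; the pairing of "xslt-config: not found" with libxslt-dev is an assumption, since libxml2-dev and libxslt-dev were installed together here.

```python
# Error fragment -> Ubuntu package installed in this thread.
KNOWN_FIXES = [
    ("libxml/xmlversion.h: No such file or directory", "libxml2-dev"),
    ("libxml/xpath.h: No such file or directory", "libxml2-dev"),
    ("xslt-config: not found", "libxslt-dev"),  # assumed pairing
    ("cannot find -lz", "zlib1g-dev"),
]


def suggest_packages(log_text):
    """Return packages whose known error fragment occurs in `log_text`."""
    suggestions = []
    for fragment, package in KNOWN_FIXES:
        # Skip duplicates so each package is suggested only once.
        if fragment in log_text and package not in suggestions:
            suggestions.append(package)
    return suggestions


if __name__ == "__main__":
    print(suggest_packages("/usr/bin/ld: cannot find -lz"))
```

The table only covers the messages seen in this issue, so an empty result does not mean the build environment is complete.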

The script runs through now but outputs many warnings. Last message:

Successfully installed lxml murmurhash preshed thinc plac pathlib ujson requests regex wrapt tqdm termcolor wcwidth toolz

Cleaning up...

/usr/bin/python: No module named webencodings; 'spacy' is a package and cannot be directly executed

judithek commented 7 years ago

correction: the script does not run through, but stops at

python -m spacy download en

with the error message: 'spacy' is a package and cannot be directly executed

judithek commented 7 years ago

this issue is resolved.

The problem above is reported in https://github.com/gabrielStanovsky/unified-factuality/issues/12