dragnet-org / dragnet

Just the facts -- web page content extraction
MIT License

ModuleNotFoundError is returned when I import dragnet as dependency #81

Open fabiana001 opened 5 years ago

fabiana001 commented 5 years ago

Hi, I have a Python project with the following setup.py:

from setuptools import setup, find_packages

setup(name='my_package',
      version='0.0.1',
      description='',
      author='',
      author_email='',
      packages=find_packages(where='src'),
      package_dir={'':'src'},
      install_requires=[
          'dragnet',
          'spacy',
          'pandas',
          'pytest'
      ]
)

I'm using Python 3.7.

When I try to install the project dependencies, I get the following error:

Searching for dragnet
Reading https://pypi.org/simple/dragnet/
Downloading https://files.pythonhosted.org/packages/2f/8c/3ae7c2824d612555bc936a0fac43568e8c3a9d4e58a88565b8a6b2a1dc7e/dragnet-2.0.3.tar.gz#sha256=58790e43f670d58277569568b4ca8e70675e985432278c0603a805ca5c9c21b7
Best match: dragnet 2.0.3
Processing dragnet-2.0.3.tar.gz
Writing /var/folders/6h/7qdbkwd17cvd43rgl46c4jr40000gp/T/easy_install-smim9159/dragnet-2.0.3/setup.cfg
Running dragnet-2.0.3/setup.py -q bdist_egg --dist-dir /var/folders/6h/7qdbkwd17cvd43rgl46c4jr40000gp/T/easy_install-smim9159/dragnet-2.0.3/egg-dist-tmp-r55dxunp
Traceback (most recent call last):
  File "/Users/lanottef/miniconda3/envs/my_package/lib/python3.7/site-packages/setuptools/sandbox.py", line 154, in save_modules
    yield saved
  File "/Users/lanottef/miniconda3/envs/my_package/lib/python3.7/site-packages/setuptools/sandbox.py", line 195, in setup_context
    yield
  File "/Users/lanottef/miniconda3/envs/my_package/lib/python3.7/site-packages/setuptools/sandbox.py", line 250, in run_setup
    _execfile(setup_script, ns)
  File "/Users/lanottef/miniconda3/envs/my_package/lib/python3.7/site-packages/setuptools/sandbox.py", line 45, in _execfile
    exec(code, globals, locals)
  File "/var/folders/6h/7qdbkwd17cvd43rgl46c4jr40000gp/T/easy_install-smim9159/dragnet-2.0.3/setup.py", line 25, in <module>
ModuleNotFoundError: No module named 'lxml'

I get the same error if I add lxml to install_requires (listed before dragnet, of course). What am I doing wrong?

Yomguithereal commented 5 years ago

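Python is a bit quirky with its dependencies. dragnet needs lxml, Cython, and numpy to already be installed in order to be "built": as your traceback shows, its own setup.py fails with the missing lxml import before anything gets installed. Listing lxml in install_requires does not guarantee that it is installed before dragnet's setup.py runs, so dragnet cannot be installed until those dependencies are in place.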

So if you do:

pip install lxml numpy Cython

and then

pip install dragnet

It should then work.
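
If you want to check the environment before retrying, here is a minimal sketch. It assumes the three build dependencies are importable under the names lxml, numpy, and Cython. The helper name missing_build_deps is made up for illustration and is not part of dragnet or pip.

```python
import importlib.util

# Build-time dependencies named above; the import names are an assumption.
BUILD_DEPS = ["lxml", "numpy", "Cython"]


def missing_build_deps(modules=BUILD_DEPS):
    """Return the names in `modules` that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_build_deps()
    if missing:
        print("Install these first: pip install " + " ".join(missing))
    else:
        print("Build dependencies found; `pip install dragnet` should be able to build.")
```

Run it with the same interpreter (here, the conda env) that you use for the install, since it only inspects the environment it runs in.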