scrapy / scrapely

A pure-python HTML screen-scraping library

Installing via pip on Python 3.7 fails #112

Open bit-chemist opened 5 years ago

bit-chemist commented 5 years ago

➜  beacon-scrapy git:(master) ✗ pip3 install scrapely
Collecting scrapely
  Downloading https://files.pythonhosted.org/packages/5e/8b/dcf53699a4645f39e200956e712180300ec52d2a16a28a51c98e96e76548/scrapely-0.13.4.tar.gz (134kB)
    100% |████████████████████████████████| 143kB 5.1MB/s
Requirement already satisfied: numpy in /usr/local/lib/python3.7/site-packages (from scrapely) (1.15.0)
Requirement already satisfied: w3lib in /usr/local/lib/python3.7/site-packages (from scrapely) (1.19.0)
Requirement already satisfied: six in /usr/local/lib/python3.7/site-packages (from scrapely) (1.11.0)
Building wheels for collected packages: scrapely
  Running setup.py bdist_wheel for scrapely ... error
  Complete output from command /usr/local/opt/python/bin/python3.7 -u -c "import setuptools, tokenize;__file__='/private/var/folders/7c/dm671s4x4v5bm8_6tprr861r0000gn/T/pip-install-p7z3xbo1/scrapely/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /private/var/folders/7c/dm671s4x4v5bm8_6tprr861r0000gn/T/pip-wheel-7rl3xgbc --python-tag cp37:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.macosx-10.13-x86_64-3.7
  creating build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/descriptor.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/version.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/extractors.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/__init__.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/template.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/htmlpage.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/tool.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  creating build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  copying scrapely/extraction/pageobjects.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  copying scrapely/extraction/similarity.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  copying scrapely/extraction/__init__.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  copying scrapely/extraction/regionextract.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  copying scrapely/extraction/pageparsing.py -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  running egg_info
  writing scrapely.egg-info/PKG-INFO
  writing dependency_links to scrapely.egg-info/dependency_links.txt
  writing requirements to scrapely.egg-info/requires.txt
  writing top-level names to scrapely.egg-info/top_level.txt
  reading manifest file 'scrapely.egg-info/SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  writing manifest file 'scrapely.egg-info/SOURCES.txt'
  copying scrapely/_htmlpage.c -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/_htmlpage.pyx -> build/lib.macosx-10.13-x86_64-3.7/scrapely
  copying scrapely/extraction/_similarity.c -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  copying scrapely/extraction/_similarity.pyx -> build/lib.macosx-10.13-x86_64-3.7/scrapely/extraction
  running build_ext
  building 'scrapely._htmlpage' extension
  creating build/temp.macosx-10.13-x86_64-3.7
  creating build/temp.macosx-10.13-x86_64-3.7/scrapely
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/lib/python3.7/site-packages/numpy/core/include -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/include/python3.7m -c scrapely/_htmlpage.c -o build/temp.macosx-10.13-x86_64-3.7/scrapely/_htmlpage.o
  scrapely/_htmlpage.c:7367:65: error: too many arguments to function call, expected 3, have 4
      return (*((__Pyx_PyCFunctionFast)meth)) (self, args, nargs, NULL);
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                     ^~~~
  /Library/Developer/CommandLineTools/usr/lib/clang/9.1.0/include/stddef.h:105:16: note: expanded from macro 'NULL'
  #  define NULL ((void*)0)
                 ^~~~~~~~~~
  1 error generated.
  error: command 'clang' failed with exit status 1
bit-chemist commented 5 years ago

It looks like the .c files shipped in the 0.13.4 sdist were generated with Cython 0.25.2, which predates Python 3.7's change to the _PyCFunctionFast calling convention (three arguments instead of four), hence the clang error above.
I regenerated _htmlpage.c and _similarity.c with Cython 0.28.4 and the package then builds and installs correctly.

https://github.com/cython/cython/blob/master/CHANGES.rst#0261-2017-08-29
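
For reference, a rough sketch of that workaround (not necessarily the exact commands used above; it assumes git and a working C compiler are available, and the Cython version pin is simply the one that worked here):

$ pip install "cython>=0.28.4"
$ git clone https://github.com/scrapy/scrapely.git
$ cd scrapely
$ cython scrapely/_htmlpage.pyx scrapely/extraction/_similarity.pyx   # regenerate the bundled .c files
$ pip install .

setup.py compiles the pre-generated .c files rather than re-running Cython, so regenerating them in the source tree before installing is the key step.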

gregorioLee commented 5 years ago

Installing via pip on Python 3.6 fails

(test) λ pip install scrapely
Collecting scrapely
  Using cached https://files.pythonhosted.org/packages/5e/8b/dcf53699a4645f39e200956e712180300ec52d2a16a28a51c98e96e76548/scrapely-0.13.4.tar.gz
Requirement already satisfied: numpy in d:\python\scripts\test\lib\site-packages (from scrapely) (1.15.2)
Requirement already satisfied: w3lib in d:\python\scripts\test\lib\site-packages (from scrapely) (1.19.0)
Requirement already satisfied: six in d:\python\scripts\test\lib\site-packages (from scrapely) (1.11.0)
Building wheels for collected packages: scrapely
  Running setup.py bdist_wheel for scrapely ... error
  Complete output from command d:\python\scripts\test\scripts\python.exe -u -c "import setuptools, tokenize;__file__='C:\Users\Administrator\AppData\Local\Temp\pip-install-yuug27ug\scrapely\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d C:\Users\Administrator\AppData\Local\Temp\pip-wheel-y8_2gbc2 --python-tag cp36:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-3.6
  creating build\lib.win-amd64-3.6\scrapely
  copying scrapely\descriptor.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\extractors.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\htmlpage.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\template.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\tool.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\version.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\__init__.py -> build\lib.win-amd64-3.6\scrapely
  creating build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\pageobjects.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\pageparsing.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\regionextract.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\similarity.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\__init__.py -> build\lib.win-amd64-3.6\scrapely\extraction
  running egg_info
  writing scrapely.egg-info\PKG-INFO
  writing dependency_links to scrapely.egg-info\dependency_links.txt
  writing requirements to scrapely.egg-info\requires.txt
  writing top-level names to scrapely.egg-info\top_level.txt
  reading manifest file 'scrapely.egg-info\SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  writing manifest file 'scrapely.egg-info\SOURCES.txt'
  copying scrapely\_htmlpage.c -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\_htmlpage.pyx -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\extraction\_similarity.c -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\_similarity.pyx -> build\lib.win-amd64-3.6\scrapely\extraction
  running build_ext
  building 'scrapely._htmlpage' extension
  error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/


  Failed building wheel for scrapely
  Running setup.py clean for scrapely
Failed to build scrapely
Installing collected packages: scrapely
  Running setup.py install for scrapely ... error
  Complete output from command d:\python\scripts\test\scripts\python.exe -u -c "import setuptools, tokenize;__file__='C:\Users\Administrator\AppData\Local\Temp\pip-install-yuug27ug\scrapely\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\Administrator\AppData\Local\Temp\pip-record-dqs_ztoo\install-record.txt --single-version-externally-managed --compile --install-headers d:\python\scripts\test\include\site\python3.6\scrapely:
  running install
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-3.6
  creating build\lib.win-amd64-3.6\scrapely
  copying scrapely\descriptor.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\extractors.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\htmlpage.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\template.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\tool.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\version.py -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\__init__.py -> build\lib.win-amd64-3.6\scrapely
  creating build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\pageobjects.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\pageparsing.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\regionextract.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\similarity.py -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\__init__.py -> build\lib.win-amd64-3.6\scrapely\extraction
  running egg_info
  writing scrapely.egg-info\PKG-INFO
  writing dependency_links to scrapely.egg-info\dependency_links.txt
  writing requirements to scrapely.egg-info\requires.txt
  writing top-level names to scrapely.egg-info\top_level.txt
  reading manifest file 'scrapely.egg-info\SOURCES.txt'
  reading manifest template 'MANIFEST.in'
  writing manifest file 'scrapely.egg-info\SOURCES.txt'
  copying scrapely\_htmlpage.c -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\_htmlpage.pyx -> build\lib.win-amd64-3.6\scrapely
  copying scrapely\extraction\_similarity.c -> build\lib.win-amd64-3.6\scrapely\extraction
  copying scrapely\extraction\_similarity.pyx -> build\lib.win-amd64-3.6\scrapely\extraction
  running build_ext
  building 'scrapely._htmlpage' extension
  error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/

----------------------------------------

Command "d:\python\scripts\test\scripts\python.exe -u -c "import setuptools, tokenize;file='C:\Users\Administrator\AppData\Local\Temp\pip-install-yuug27ug\scrapely\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Users\Administrator\AppData\Local\Temp\pip-record-dqs_ztoo\install-record.txt --single-version-externally-managed --compile --install-headers d:\python\scripts\test\include\site\python3.6\scrapely" failed with error code 1 in C:\Users\Administrator\AppData\Local\Temp\pip-install-yuug27ug\scrapely\

marekyggdrasil commented 4 years ago

My system:

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.14.5
BuildVersion:   18F132

My Python:

$ python -V
Python 3.7.4

My scrapely installation output via pip:

$ pip install scrapely
Collecting scrapely
  Downloading https://files.pythonhosted.org/packages/c2/2e/d03841a9a0278598684ca30710c810a63b2e617fca07ee3af53e3305af1f/scrapely-0.14.1.tar.gz (155kB)
    100% |████████████████████████████████| 163kB 847kB/s 
Requirement already satisfied: numpy in ./.pyenv/versions/3.7.4/lib/python3.7/site-packages (from scrapely) (1.17.4)
Collecting w3lib (from scrapely)
  Using cached https://files.pythonhosted.org/packages/6a/45/1ba17c50a0bb16bd950c9c2b92ec60d40c8ebda9f3371ae4230c437120b6/w3lib-1.21.0-py2.py3-none-any.whl
Requirement already satisfied: six in ./.pyenv/versions/3.7.4/lib/python3.7/site-packages (from scrapely) (1.13.0)
Installing collected packages: w3lib, scrapely
  Running setup.py install for scrapely ... done
Successfully installed scrapely-0.14.1 w3lib-1.21.0
You are using pip version 19.0.3, however version 19.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

It seems to be working just fine. @bit-chemist, can you try one more time? It has been quite a while since Aug 1, 2018.
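
For anyone re-testing, a minimal smoke test of a fresh install (just an import check; Scraper is the entry point documented in the README):

$ python -c "from scrapely import Scraper; print('scrapely import OK')"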