scrapinghub / extruct

Extract embedded metadata from HTML markup
BSD 3-Clause "New" or "Revised" License

RecursionError('maximum recursion depth exceeded while calling a Python object',) #111

Closed ivanprado closed 4 years ago

ivanprado commented 5 years ago

The REST API is not working with extruct 0.7.2.

Every request gets a response like this:

{
url: "https://nerdist.com/article/star-wars-cast-reylo-episode-ix/",
status: "error",
message: "RecursionError('maximum recursion depth exceeded while calling a Python object',)"
}

A warning is shown at startup:

python -m extruct.service
/home/ivan/Documentos/scrapinghub/dev/extruct/extruct/service.py:3: MonkeyPatchWarning: Monkey-patching ssl after ssl has already been imported may lead to errors, including RecursionError on Python 3.6. It may also silently lead to incorrect behaviour on Python 3.7. Please monkey-patch earlier. See https://github.com/gevent/gevent/issues/1016. Modules that had direct imports (NOT patched): ['urllib3.util (/home/ivan/Documentos/scrapinghub/dev/extruct/venv/lib/python3.6/site-packages/urllib3/util/__init__.py)', 'urllib3.util.ssl_ (/home/ivan/Documentos/scrapinghub/dev/extruct/venv/lib/python3.6/site-packages/urllib3/util/ssl_.py)']. 
  monkey.patch_all()
Bottle v0.12.16 server starting up (using GeventServer())...
Listening on http://0.0.0.0:10005/
Hit Ctrl-C to quit.

A possible solution is described in this comment: https://github.com/gevent/gevent/issues/1235#issuecomment-395423091
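
For reference, here is a rough sketch of what "monkey-patch earlier" from the warning could look like in an entry-point module. It assumes the fix is to call `monkey.patch_all()` before anything imports `ssl`. The helper names (`imported_too_early`, `patch_early`) and the list of modules to check are made up for illustration and are not part of extruct or gevent. I have not verified that this resolves the error in the service.

```python
import sys

# Modules to check for; ssl and urllib3 are the ones named in the warning,
# requests is added here as an assumption because it pulls in urllib3.
SENSITIVE = ("ssl", "urllib3", "requests")


def imported_too_early(names=SENSITIVE, loaded=None):
    """Return the entries of `names` that are already imported.

    `loaded` defaults to sys.modules; it is a parameter only so the
    check can be exercised without touching the real import state.
    """
    loaded = sys.modules if loaded is None else loaded
    return [name for name in names if name in loaded]


def patch_early():
    """Hypothetical entry-point helper: patch before other imports.

    Returns (modules imported before patching, whether patching ran).
    """
    early = imported_too_early()
    try:
        from gevent import monkey  # real gevent API
    except ImportError:
        # gevent is not installed, so there is nothing to patch.
        return early, False
    monkey.patch_all()
    return early, True


if __name__ == "__main__":
    # Call this first, before importing bottle, requests, extruct, etc.
    early, patched = patch_early()
    if early:
        print("imported before patching:", ", ".join(early))
```

If `patch_early()` reports any modules, the patch is still happening too late and the import order of the entry point needs to change.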

pip list:

Package        Version    Location                                     
-------------- ---------- ---------------------------------------------
atomicwrites   1.3.0      
attrs          18.2.0     
beautifulsoup4 4.7.1      
bottle         0.12.16    
bumpversion    0.5.3      
certifi        2018.11.29 
chardet        3.0.4      
entrypoints    0.3        
extruct        0.7.2      
filelock       3.0.10     
flake8         3.7.5      
gevent         1.4.0      
greenlet       0.4.15     
html5lib       1.0.1      
idna           2.8        
isodate        0.6.0      
lxml           4.3.0      
mccabe         0.6.1      
mf2py          1.1.2      
more-itertools 5.0.0      
pip            10.0.1     
pluggy         0.8.1      
py             1.7.0      
pycodestyle    2.5.0      
pyflakes       2.1.0      
pyparsing      2.3.1      
pytest         4.2.0      
rdflib         4.2.2      
rdflib-jsonld  0.4.0      
requests       2.21.0     
setuptools     39.1.0     
six            1.12.0     
soupsieve      1.7.3      
toml           0.10.0     
tox            3.7.0      
urllib3        1.24.1     
virtualenv     16.3.0     
w3lib          1.20.0     
webencodings   0.5.1

kmike commented 5 years ago

See also: https://github.com/scrapinghub/extruct/issues/63