ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

[wpull] 'cython_function_or_method' object has no attribute 'lower' #173

Open tempname1024 opened 3 years ago

tempname1024 commented 3 years ago
$ grab-site --version
2.2.0

[ ... ]
https://tools.ietf.org/rfcdiff?url2=draft-ietf-sipcore-presence-scaling-requirements-02.txt ...
https://tools.ietf.org/html/draft-ietf-sipcore-presence-scaling-requirements-01 ...
DUPE https://tools.ietf.org/html/draft-ietf-sipcore-presence-scaling-requirements-01
  OF https://tools.ietf.org/html/draft-ietf-sipcore-presence-scaling-requirements-01
https://tools.ietf.org/id/draft-ietf-sipcore-presence-scaling-requirements-02.xml ...
ERROR Fatal exception.
Traceback (most recent call last):
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/application/app.py", line 157, in run
    yield from pipeline.process()
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 194, in process
    yield from self._process_one_worker()
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 215, in _process_one_worker
    task.result()
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 119, in process
    item = yield from self.process_one(_worker_id=worker_id)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 103, in process_one
    yield from task.process(item)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/application/tasks/download.py", line 421, in process
    yield from session.app_session.factory['Processor'].process(session)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/delegate.py", line 29, in process
    return (yield from processor.process(item_session))
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/web.py", line 91, in process
    return (yield from session.process())
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/web.py", line 185, in process
    yield from self._process_loop()
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/web.py", line 244, in _process_loop
    exit_early, wait_time = yield from self._fetch_one(cast(Request, self._item_session.request))
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/web.py", line 308, in _fetch_one
    action = self._handle_response(request, response)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/web.py", line 423, in _handle_response
    self._processing_rule.scrape_document(self._item_session)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/libgrabsite/wpull_tweaks.py", line 55, in scrape_document
    super().scrape_document(item_session)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/processor/rule.py", line 527, in scrape_document
    item_session.url_record.link_type
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/scraper/base.py", line 186, in scrape_info
    scrape_result = scraper.scrape(request, response, link_type)
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/scraper/html.py", line 114, in scrape
    elements, response, base_url, link_contexts
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/scraper/html.py", line 176, in _process_elements
    if not self._is_accepted(link_info.tag):
  File "/home/jordan/apps/gs/lib/python3.7/site-packages/wpull/scraper/html.py", line 257, in _is_accepted
    element_tag = element_tag.lower()
AttributeError: 'cython_function_or_method' object has no attribute 'lower'
CRITICAL Sorry, Wpull unexpectedly crashed.
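
The last frame suggests that `element_tag` was not a string when `_is_accepted` called `.lower()` on it. One plausible source, not confirmed here, is a comment or processing-instruction node in the XML document: lxml exposes the `.tag` of such nodes as a factory function rather than a string. Below is a minimal sketch of a defensive check under that assumption. `is_accepted_tag`, `ACCEPTED_TAGS`, and `fake_comment_factory` are hypothetical names for illustration and are not wpull's actual code.

```python
def fake_comment_factory():
    """Stand-in for a non-string tag (e.g. a callable used as a node's tag)."""


# Hypothetical allow-list of tag names; wpull's real configuration may differ.
ACCEPTED_TAGS = frozenset({"a", "img", "link"})


def is_accepted_tag(element_tag, accepted=ACCEPTED_TAGS):
    """Return True if element_tag is a string naming an accepted tag."""
    # Non-string tags have no .lower(); treat them as not accepted
    # instead of letting an AttributeError end the crawl.
    if not isinstance(element_tag, str):
        return False
    return element_tag.lower() in accepted


print(is_accepted_tag("A"))
print(is_accepted_tag(fake_comment_factory))
```

Whether such nodes should be skipped silently or logged is a design choice for the maintainers; the sketch only shows that a type check before `.lower()` avoids the crash.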