openzim / wikihow

WikiHow scraper
https://download.kiwix.org/zim/wikihow/
GNU General Public License v3.0

TypeError: 'NoneType' object is not callable #140

Closed by kelson42 1 year ago

kelson42 commented 1 year ago

https://farm.openzim.org/pipeline/a15bc951c0252ca55c725236

Exception ignored in: <function MagicDetect.__del__ at 0x7f335b7e1b80>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/magic.py", line 308, in __del__
  File "/usr/local/lib/python3.8/site-packages/magic.py", line 135, in close
TypeError: 'NoneType' object is not callable

rgaudin commented 1 year ago

Duplicate of #139
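
For context on the "Exception ignored in ... __del__" message above: Python documents that `__del__` can run during interpreter shutdown, when module-level names it relies on may already have been set to None. Calling such a name gives exactly "'NoneType' object is not callable". Whether that is the cause here is not confirmed in this thread. The sketch below is a toy reproduction of a shutdown-tolerant finalizer; `Detector`, `_native_close` and `closed_handles` are hypothetical names, not the real magic.py symbols.

```python
# Stand-in for a C-library close function bound at module level
# (hypothetical; not the actual magic.py implementation).
closed_handles = []


def _native_close(handle):
    closed_handles.append(handle)


class Detector:
    """Toy resource wrapper whose finalizer tolerates a missing close function."""

    def __init__(self, handle):
        self._handle = handle

    def close(self):
        # The global is looked up at call time; during interpreter
        # shutdown it may already have been reset to None.
        close_fn = _native_close
        if self._handle is not None and callable(close_fn):
            close_fn(self._handle)
        # Mark as closed either way so a second call is a no-op.
        self._handle = None

    def __del__(self):
        self.close()
```

Calling `close()` explicitly (or using a context manager) before the program exits avoids depending on `__del__` at shutdown at all.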

[MainThread::2022-09-20 16:05:55,974] INFO:>> Article:Play-"Mijn-Oma-Is-Jarig"
[MainThread::2022-09-20 16:09:37,781] ERROR:Interrupting process due to error: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
[MainThread::2022-09-20 16:09:37,781] ERROR:('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/usr/local/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.8/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.8/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.8/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/local/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/usr/local/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.8/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/wikihow2zim-1.2.1-py3.8.egg/wikihow2zim/scraper.py", line 991, in run
    self.scrape_articles()
  File "/usr/local/lib/python3.8/site-packages/wikihow2zim-1.2.1-py3.8.egg/wikihow2zim/scraper.py", line 513, in scrape_articles
    if not self.scrape_article(article):
  File "/usr/local/lib/python3.8/site-packages/wikihow2zim-1.2.1-py3.8.egg/wikihow2zim/scraper.py", line 615, in scrape_article
    soup, _ = get_soup(f"/{article}")
  File "/usr/local/lib/python3.8/site-packages/wikihow2zim-1.2.1-py3.8.egg/wikihow2zim/utils.py", line 148, in get_soup
    content, paths = fetch(path, **params)
  File "/usr/local/lib/python3.8/site-packages/wikihow2zim-1.2.1-py3.8.egg/wikihow2zim/utils.py", line 75, in fetch
    resp = Global.session.get(get_url(path, **params), params=params)
  File "/usr/local/lib/python3.8/site-packages/requests/sessions.py", line 555, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)

The issue is #122.
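
The traceback above shows a single "Connection reset by peer" aborting the whole scrape. One common mitigation is to retry the failing call with a backoff before giving up. The helper below is a minimal, standard-library-only sketch of that idea; `call_with_retries` and its parameters are hypothetical and are not part of wikihow2zim or of the fix discussed in #122.

```python
import time


def call_with_retries(func, exceptions=(ConnectionError,), attempts=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry.

    The last failure is re-raised unchanged once `attempts` is exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The built-in `ConnectionResetError` is a subclass of `ConnectionError`, so the default covers the first exception in the chain. The final exception in the traceback is raised by requests, which uses its own `requests.exceptions.ConnectionError`; that class would have to be passed in `exceptions` to retry a call such as the `Global.session.get(...)` in `fetch`.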