WolfgangFahl / ConferenceCorpus

ScientificEventCorpus
Apache License 2.0

wikiCFP 500 Internal Server Error and Timeout Handling #14

Closed WolfgangFahl closed 3 years ago

WolfgangFahl commented 3 years ago
Exception in thread Thread-1:
Traceback (most recent call last):
  File "datasources/wikicfpscrape.py", line 294, in crawl
    rawEvent=wEvent.fromEventId(eventId)
  File "datasources/wikicfpscrape.py", line 479, in fromEventId
    return self.fromUrl(url)
  File "datasources/wikicfpscrape.py", line 558, in fromUrl
    raise Exception(f"fromUrl {url} failed {scrape.err}")
Exception: fromUrl http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=95961 failed HTTP Error 500: Internal Server Error
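
The traceback above shows a single failing event page (HTTP 500) ending the whole crawl thread, because the exception raised in `fromUrl` is never caught. A minimal sketch of one way to keep the loop going is to catch the error per event and record it. The names `crawl_events` and `fetch_event` are hypothetical and not part of `wikicfpscrape.py`; `fetch_event` stands in for a call such as `wEvent.fromEventId`.

```python
def crawl_events(event_ids, fetch_event):
    """Fetch each event, collecting failures instead of aborting.

    fetch_event is a hypothetical callable (event_id -> raw event)
    that may raise, as fromUrl does in the traceback above.
    """
    results = {}
    failures = {}
    for event_id in event_ids:
        try:
            results[event_id] = fetch_event(event_id)
        except Exception as ex:
            # fromUrl raises a plain Exception, so the broad catch is
            # deliberate here; the message is kept for later inspection.
            failures[event_id] = str(ex)
    return results, failures
```

With this shape, the failed event IDs can be logged or retried later rather than stopping the thread.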
WolfgangFahl commented 3 years ago
Exception in thread Thread-10:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "datasources/wikicfpscrape.py", line 294, in crawl
    rawEvent=wEvent.fromEventId(eventId)
  File "datasources/wikicfpscrape.py", line 479, in fromEventId
    return self.fromUrl(url)
  File "datasources/wikicfpscrape.py", line 556, in fromUrl
    triples=scrape.parseRDFa(url)
  File "/hd/sengo/home/wf/source/python/ConferenceCorpus/datasources/webscrape.py", line 142, in parseRDFa
    self.soup=self.getSoup(url, self.showHtml)         
  File "/hd/sengo/home/wf/source/python/ConferenceCorpus/datasources/webscrape.py", line 93, in getSoup
    response = urllib.request.urlopen(url,timeout=self.timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1383, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.8/urllib/request.py", line 1358, in do_open
    r = h.getresponse()
  File "/usr/lib/python3.8/http/client.py", line 1344, in getresponse
    response.begin()
  File "/usr/lib/python3.8/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.8/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out
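
The second traceback shows `urllib.request.urlopen` raising `socket.timeout` while reading the response. A possible approach, sketched here under assumptions, is to retry transient failures (timeouts and 5xx responses) a few times with a growing delay. The function `fetch_with_retry` and its parameters are hypothetical and do not exist in `webscrape.py`; the `opener` and `sleep` parameters are there only so the behaviour can be exercised without network access.

```python
import socket
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, timeout=5.0, retries=3, backoff=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body, retrying timeouts and HTTP 5xx errors."""
    last_error = None
    for attempt in range(retries):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # HTTPError is a subclass of URLError, so it is handled first.
            # Client errors (4xx) are not retried.
            if err.code < 500:
                raise
            last_error = err
        except (socket.timeout, urllib.error.URLError) as err:
            # Read timeouts surface as socket.timeout; connection
            # problems are usually wrapped in URLError.
            last_error = err
        if attempt < retries - 1:
            # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
            sleep(backoff * (2 ** attempt))
    raise RuntimeError(f"{url} failed after {retries} attempts: {last_error}")
```

Whether retrying is appropriate for wikiCFP, and which retry count and delay are polite to the server, is a judgment call this sketch does not settle.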