JohnSmithDev / ISFDB-Tools

Tools to query a local copy of the ISFDB database

downloads.py doesn't handle network outages gracefully #16

Open JohnSmithDev opened 5 years ago

JohnSmithDev commented 5 years ago

Every now and again you get something like this:

    isfdb_tools $ ./bulk_author_gender.py -y 1930-1950
    ...
    Traceback (most recent call last):
      File "/proj/isfdb_tools/author_gender.py", line 153, in get_author_gender_from_ids_and_then_name_cached
        return gagfiatn_cache[cache_key]
    KeyError: (6624, 'Valentine Davies')

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/proj/book_scraping/lib64/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw)
      File "/proj/book_scraping/lib64/python3.6/site-packages/urllib3/util/connection.py", line 57, in create_connection
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
      File "/usr/lib64/python3.6/socket.py", line 745, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -2] Name or service not known
    ... loads more traceback deleted ...
    requests.exceptions.ConnectionError: HTTPConnectionPool(host='en.wikipedia.org', port=80): Max retries exceeded with url: /wiki/Valentine_Davies (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f33eb18a2b0>: Failed to establish a new connection: [Errno -2] Name or service not known',))

When downloading a bunch of pages from Wikipedia (or Twitter), catching this error and carrying on probably isn't much use, because (a) any subsequent downloads from the same site will likely fail for the same reason, and (b) the stats at the end will likely be skewed.

However, it might be worth logging the failure and exiting more gracefully, rather than spewing out a couple of pages of stack trace.
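
A rough sketch of what "log and exit gracefully" might look like. None of these names exist in downloads.py; `DownloadFailed`, `fetch_or_fail` and `run_downloads` are made up for illustration, and the `fetch` argument stands in for whatever actually does the HTTP request. It relies on `requests.exceptions.RequestException` being a subclass of `IOError` (an alias of `OSError` in Python 3), so catching `OSError` also covers the `ConnectionError` shown above.

```python
import logging

logger = logging.getLogger("downloads")  # hypothetical logger name


class DownloadFailed(Exception):
    """Hypothetical exception: a page could not be fetched due to a network problem."""


def fetch_or_fail(url, fetch):
    """Call fetch(url), collapsing low-level network errors into one DownloadFailed."""
    try:
        return fetch(url)
    except OSError as err:
        # "from None" drops the chained context, so the KeyError from the
        # cache miss and the urllib3 frames are not attached to the new error.
        raise DownloadFailed(f"Could not download {url}: {err}") from None


def run_downloads(urls, fetch):
    """Fetch each URL in turn; stop at the first network failure.

    Returns 0 if everything was fetched and 1 otherwise, so the caller can
    pass the result to sys.exit().
    """
    for url in urls:
        try:
            fetch_or_fail(url, fetch)
        except DownloadFailed as err:
            logger.error("%s", err)
            logger.error("Giving up: later downloads would likely fail the same way")
            return 1
    return 0
```

A script could then end with something like `sys.exit(run_downloads(urls, requests.get))`. Stopping at the first failure rather than skipping the page is deliberate: it avoids both a run of identical errors and skewed stats at the end.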