jacksongoode / NIME-proceedings-analyzer

A tool for the bibliographic analysis of the NIME proceedings archive
GNU General Public License v3.0

urllib3.exceptions.SSLError when downloading nime2013_113.pdf #4

Closed stefanofasciani closed 4 months ago

stefanofasciani commented 2 years ago

The paper "Now on: New Interfaces for Traditional Korean Music and Dance" (http://www.nime.org/proceedings/2013/nime2013_113.pdf) triggers the following exception:

urllib3.exceptions.SSLError: [SSL: KRB5_S_TKT_NYV] unexpected eof while reading (_ssl.c:2570)

at line 201 of https://github.com/jacksongoode/NIME-proceedings-analyzer/blob/main/pa_extract.py

According to other users affected by the same or a similar problem, it may be caused by a bug in a specific version of OpenSSL on the server side.

However, this issue occurs only with nime2013_113.pdf, which is the largest PDF in the NIME proceedings (80+ MB) and has not yet been fixed (see https://github.com/NIME-conference/NIME-bibliography/issues/61).

This issue may not affect all machines. At the moment, a simple workaround is to manually download http://www.nime.org/proceedings/2013/nime2013_113.pdf and move it into cache/pdf/ .
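
A rough sketch of that workaround as a script, in case it helps. It assumes the cached file keeps the same name as in the URL (nime2013_113.pdf) and that cache/pdf/ is relative to the working directory. `install_manual_download` is a hypothetical helper, not something that exists in pa_extract.py. It only checks the `%PDF-` header before moving the file, so a saved error page is not mistaken for the paper.

```python
import shutil
from pathlib import Path


def install_manual_download(downloaded: Path, cache_dir: Path = Path("cache/pdf")) -> Path:
    """Move a manually downloaded PDF into the cache directory.

    Hypothetical helper: assumes the analyzer looks for the file under
    cache_dir using the same file name as in the download URL.
    """
    # Every PDF starts with the bytes "%PDF-"; reject anything else
    # (for example an HTML error page saved with a .pdf extension).
    with downloaded.open("rb") as fh:
        if fh.read(5) != b"%PDF-":
            raise ValueError(f"{downloaded} does not look like a PDF")

    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / downloaded.name
    shutil.move(str(downloaded), str(target))
    return target
```

For example, after saving the paper from a browser, `install_manual_download(Path.home() / "Downloads" / "nime2013_113.pdf")` would place it in cache/pdf/ (the Downloads location is just an example).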

Wrapping line 225 of pa_extract.py with try/except solves the issue, but then it triggers another "file not found" error (i.e. the whole follow-up code has to be changed to work without the downloaded pdf file).
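
One possible shape for that change, as a minimal sketch rather than a patch against pa_extract.py: catch the download error, record the URL as failed, and hand only the files that actually exist on disk to the later steps. All names here (`download_available`, `fetch`, `errors`) are hypothetical, and the cache file name is assumed to be the last path segment of the URL. The downloader is passed in as a callable so the skip logic does not depend on a particular HTTP library.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple, Type


def download_available(
    urls: Iterable[str],
    cache_dir: Path,
    fetch: Callable[[str], bytes],
    errors: Tuple[Type[BaseException], ...] = (OSError,),
) -> Tuple[List[Path], List[str]]:
    """Download each URL into cache_dir, skipping the ones that fail.

    Returns (paths that exist on disk, URLs that could not be downloaded).
    `fetch` is any callable returning the file content as bytes;
    `errors` lists the exception types that count as a failed download.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    available: List[Path] = []
    failed: List[str] = []

    for url in urls:
        # Assumption: the cached file is named after the last URL segment.
        dest = cache_dir / url.rsplit("/", 1)[-1]
        if not dest.exists():  # a manually placed file is used as-is
            try:
                data = fetch(url)
            except errors as exc:
                print(f"Skipping {url}: {exc!r}")
                failed.append(url)
                continue  # nothing is written, nothing is passed on
            dest.write_bytes(data)
        available.append(dest)

    return available, failed
```

With requests, `fetch` could be `lambda u: requests.get(u, allow_redirects=True).content`, and `errors` would probably need to include both `requests.exceptions.RequestException` and `urllib3.exceptions.HTTPError`, since the traceback above shows the urllib3 exception reaching the caller unwrapped. The follow-up code would then iterate over `available` only, which avoids the "file not found" error.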

To replicate the issue it is sufficient to run the following lines:

    import requests

    url = "http://www.nime.org/proceedings/2013/nime2013_113.pdf"
    r = requests.get(url, allow_redirects=True)
    open("./test.pdf", 'wb').write(r.content)