Closed vgreg closed 10 months ago
Thank you for the report @vgreg
Can you please share a reproducible example that caused this error? It seems like the except clauses aren't catching whatever error is thrown, and then we don't have r defined. It'd be nice to have an example to figure out what that error is.
I am still looking for an example that will consistently reproduce the error. I was retrieving all articles for a set of about 150 journals and had the error occur for two journals, but I have been able to re-run the request for both with no issue the second time.
Here is a simplified version of a request that failed once but has been working every other time:
from habanero import Crossref
cr = Crossref()
query = {"issn": "0028-3932"}
responses = cr.works(
filter=query, cursor="*", cursor_max=12000
)
cursor_max is set to slightly more than the number of DOIs for the journal.
Thanks - I'll see if I can get that to fail
This may be difficult to track down. The fact that it doesn't happen consistently suggests it's an intermittent problem with the Crossref API.
I was able to reproduce a similar error and see what gets printed on line 163. Here is the exception:
HTTPSConnectionPool(host='api.crossref.org', port=443): Max retries exceeded with url: /works?filter=issn%3A0028-3932&cursor=DnF1ZXJ5VGhlbkZldGNoBgAAAAAFuuH-Fmx3VDZUUHY5VHlhdThmaGVtbFhBOVEAAAAABcXycBZPY3FES3VMU1R5R3JIWHlwQUZBcktnAAAAAAXd460WTUpsaGN0RGFRbS1yN0ZYWTJ3MG5pUQAAAAAGAprKFlVTQUNpdVFEVHZLdWVZQWxVZEJDUUEAAAAABblxaxY0bldDU3pmSlJZeWhaSGk2VHVVdHh3AAAAAAW0yJAWaUpOMms5em5SUmVMR2JjT2VGdEFtdw%3D%3D (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0xffff33f68e90>, 'Connection to api.crossref.org timed out. (connect timeout=None)'))
It seems that ConnectTimeout is derived from RequestException, so it is caught on line 162:
https://requests.readthedocs.io/en/latest/api/#requests.ConnectionError
However, the code continues to line 164 with r still undefined.
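The failure mode described here can be reproduced without touching the network. The following is only a minimal sketch of the pattern, not the actual habanero code: RequestException and ConnectTimeout are local stand-ins for the classes in requests.exceptions, and fake_get and do_request are hypothetical names made up for the illustration.

```python
class RequestException(Exception):
    """Local stand-in for requests.exceptions.RequestException."""


class ConnectTimeout(RequestException):
    """Local stand-in for requests.exceptions.ConnectTimeout."""


def fake_get(url):
    # Hypothetical replacement for requests.get() that always times out,
    # so the assignment to r below never completes.
    raise ConnectTimeout(f"Connection to {url} timed out.")


def do_request(url):
    try:
        r = fake_get(url)
    except RequestException as e:
        # The error is printed, but execution falls through to the
        # code after the try/except block.
        print(e)
    # r was never bound, so using it here raises UnboundLocalError.
    return r


def reproduce():
    """Return the name of the exception raised by do_request, if any."""
    try:
        do_request("https://api.crossref.org/works")
    except UnboundLocalError as e:
        return type(e).__name__
    return None


if __name__ == "__main__":
    print(reproduce())
```

Because the except clause swallows the original timeout, the caller only sees the secondary UnboundLocalError, which hides the real cause unless the printed message is noticed.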
Okay, thanks for this. I'll try to get to this soon
Having the same issue...
HTTPSConnectionPool(host='api.crossref.org', port=443): Max retries exceeded with url: /works?query=author%3AMONAGHAN+A%2BAND%2Btitle%3A%E2%80%98CALMLY+CRITICAL%E2%80%99%3A+EVOLVING+RUSSIAN+VIEWS+OF+US+HEGEMONY%2BAND%2Byear%3A2006%2BAND%2Bjournal%3AJOURNAL+OF+STRATEGIC+STUDIES&rows=1 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f3fec6f7f70>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
An error occurred: local variable 'r' referenced before assignment
Let me know if I can help with debugging (but the run continues despite the error...)
thanks for your report @sdspieg ! Sorry about the issue. I started working on this, but I just haven't had time to finish it off. I'll let you know if I could use any help.
@vgreg @sdspieg Can both of you reinstall from GitHub and try again?
closing for now, if it pops up again ping here
Python 3.11 Habanero 1.2.3
I'm getting the following error on line 164 of request_class.py:
UnboundLocalError: cannot access local variable 'r' where it is not associated with a value
It seems that you can reach that line (check_json(r)) with r undefined if requests.get() raises a RequestException before returning. Because the exception is caught and printed, the code continues with r still undefined.
https://github.com/sckott/habanero/blob/5228483aa101214c4c945c72073d3c8b4d60101e/habanero/request_class.py#L143-L165
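One possible way to avoid reaching the later code with r undefined is to re-raise from the except clause instead of printing and continuing. The sketch below only illustrates that idea; it is not confirmed to be the change made in habanero. RequestException is a local stand-in for requests.exceptions.RequestException, and fetch, failing_get, and working_get are hypothetical names.

```python
class RequestException(Exception):
    """Local stand-in for requests.exceptions.RequestException."""


def fetch(url, get):
    """Call get(url) and return its result, or raise if the request fails.

    `get` is any callable with the same role as requests.get (assumption:
    it raises RequestException on network failure).
    """
    try:
        response = get(url)
    except RequestException as exc:
        # Re-raise with context so the caller sees the real cause,
        # and the code below never runs with `response` unbound.
        raise RuntimeError(f"request to {url} failed: {exc}") from exc
    return response


def failing_get(url):
    # Hypothetical getter that simulates a connection timeout.
    raise RequestException("Connection timed out.")


def working_get(url):
    # Hypothetical getter that simulates a successful response.
    return {"status": "ok"}


if __name__ == "__main__":
    print(fetch("https://api.crossref.org/works", working_get))
    try:
        fetch("https://api.crossref.org/works", failing_get)
    except RuntimeError as err:
        print(err)
```

With this shape, a caller looping over many journals could catch the raised error and retry or skip that journal, rather than hitting an unrelated UnboundLocalError.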