I noticed that the URLStorage test_exists() test would sometimes hang indefinitely on my local machine. Turns out it was happening on UNREACHABLE_URL, an undefined coralnet URL which is supposed to return 404. Seems that was happening because urlopen(), used in URLStorage's exists(), doesn't define a timeout by default. And of course, the coralnet site has been known (especially recently) to be unresponsive for minutes at times.
I don't know whether the deploy-API jobs running on Batch had timeouts or not, but for what it's worth, I haven't seen any TimeoutError logged in the past couple of months (i.e. since urlopen() replaced wget here in production). One would think a lack of timeout there could endanger our Batch resources. I'm not sure what are the chances that could also propagate to endangering the web server's responsiveness; it might.
What I do know is that my local environment seems to benefit from an explicit timeout in URLStorage's urlopen() usages, and that it wouldn't hurt to have for production as well. So that's what this PR adds.
I took some time to mess around with Python sockets to confirm the semantics of the timeout kwarg. It seems that it gets a URLError if the initial part of the response doesn't arrive by the timeout, and it gets a socket.timeout (superseded by and aliased to TimeoutError in Python 3.10, which is coralnet's Python version) if there's a continuous idle period surpassing the timeout in the middle of sending the response. It is not a timeout enforced against the entire duration of the response, so if there is just a lot of data to send over a low-bandwidth connection, then that's not a concern.
URLStorage test updates in the meantime:
Use mocking instead of an UNREACHABLE_URL. This way we can test the code paths for a 404 and a timeout without relying on the status/characteristics of a particular live website.
Also use mocking instead of an UNREACHABLE_DOMAIN. If CI doesn't have internet connectivity, then it gets a different error because it can't do a DNS lookup at all.
Apply @require_test_fixtures to fewer tests so that some of these updated tests can be run in CI.
I noticed that the URLStorage
test_exists()
test would sometimes hang indefinitely on my local machine. Turns out it was happening onUNREACHABLE_URL
, an undefined coralnet URL which is supposed to return 404. Seems that was happening becauseurlopen()
, used in URLStorage'sexists()
, doesn't define a timeout by default. And of course, the coralnet site has been known (especially recently) to be unresponsive for minutes at times.I don't know whether the deploy-API jobs running on Batch had timeouts or not, but for what it's worth, I haven't seen any
TimeoutError
logged in the past couple of months (i.e. sinceurlopen()
replacedwget
here in production). One would think a lack of timeout there could endanger our Batch resources. I'm not sure what are the chances that could also propagate to endangering the web server's responsiveness; it might.What I do know is that my local environment seems to benefit from an explicit timeout in URLStorage's
urlopen()
usages, and that it wouldn't hurt to have for production as well. So that's what this PR adds.I took some time to mess around with Python sockets to confirm the semantics of the timeout kwarg. It seems that it gets a
URLError
if the initial part of the response doesn't arrive by the timeout, and it gets asocket.timeout
(superseded by and aliased toTimeoutError
in Python 3.10, which is coralnet's Python version) if there's a continuous idle period surpassing the timeout in the middle of sending the response. It is not a timeout enforced against the entire duration of the response, so if there is just a lot of data to send over a low-bandwidth connection, then that's not a concern.URLStorage test updates in the meantime:
UNREACHABLE_URL
. This way we can test the code paths for a 404 and a timeout without relying on the status/characteristics of a particular live website.UNREACHABLE_DOMAIN
. If CI doesn't have internet connectivity, then it gets a different error because it can't do a DNS lookup at all.@require_test_fixtures
to fewer tests so that some of these updated tests can be run in CI.