Per #43, this is my take at replacing the custom retry code with the tried and tested urllib3 Retry class. Benefits include that it also handles Timeout and ConnectionError, and that the sleep time increases for each retry. Tests pass, and I'm also testing the branch in a real world scenario now, where I need to harvest from a server which is quite unstable, and happens to produce both timeouts, connection errors and 500 errors during updates. It seems to work fine so far!
A slight inconvenience of retrying upon ConnectionError, is that it will also retry if you misspell the hostname. Personally, I think the benefits of handling ConnectionError outweighs that incovenience, but it could be something to discuss. If the rest of the URL contains a typo, so that the server returns a 404, it will of course still raise an error immediately.
Perhaps the major issue here is that the default_retry_after argument can no longer be supported. It has been replaced by a new argument retry_backoff_factor, to control the exponential backoff factor, but the replacement will be a breaking change since the old argument cannot be converted into the new one.
Also note that I extended the default retry_status_codes with more codes. Since this is "OAI for humans", and humans do not necessarily know which errors to expect up front, I think it makes sense to include some more common error codes by default. But I can leave this extension out of this PR if it's controversial.
Logging: If it's desirable to have logging, one possibility is to enable debug logging for the retry module:
Per #43, this is my take at replacing the custom retry code with the tried and tested urllib3 Retry class. Benefits include that it also handles
Timeout
andConnectionError
, and that the sleep time increases for each retry. Tests pass, and I'm also testing the branch in a real world scenario now, where I need to harvest from a server which is quite unstable, and happens to produce both timeouts, connection errors and 500 errors during updates. It seems to work fine so far!A slight inconvenience of retrying upon
ConnectionError
, is that it will also retry if you misspell the hostname. Personally, I think the benefits of handling ConnectionError outweighs that incovenience, but it could be something to discuss. If the rest of the URL contains a typo, so that the server returns a 404, it will of course still raise an error immediately.Perhaps the major issue here is that the
default_retry_after
argument can no longer be supported. It has been replaced by a new argumentretry_backoff_factor
, to control the exponential backoff factor, but the replacement will be a breaking change since the old argument cannot be converted into the new one.Also note that I extended the default
retry_status_codes
with more codes. Since this is "OAI for humans", and humans do not necessarily know which errors to expect up front, I think it makes sense to include some more common error codes by default. But I can leave this extension out of this PR if it's controversial.Logging: If it's desirable to have logging, one possibility is to enable debug logging for the retry module:
to get log messages like these:
I'm not sure if this should be the responsibility of sickle or the user, though.