ivanprado closed this issue 3 years ago.
More on that: scrapinghub-autoextract raises a RetryError exception when retries fail, but raises the wrapped exception directly when retries are disabled.
IMO this is not the right API, because it forces the user to handle two cases for each possible error. For example, imagine the developer wants to do something when RequestError is raised. They have to catch two exceptions:
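Something like this minimal sketch (import paths, `fetch`, and `handle_request_error` are illustrative assumptions, not the library's documented API):

```python
import tenacity
# NOTE: import paths below are assumptions, for illustration only
from autoextract.aio import request_raw
from autoextract.aio.errors import RequestError

def handle_request_error(exc):
    ...  # whatever the application needs to do

async def fetch(query):
    try:
        return await request_raw(query)
    except RequestError as exc:
        # retries disabled: the underlying error arrives directly
        handle_request_error(exc)
    except tenacity.RetryError as exc:
        # retries enabled and exhausted: the same error arrives wrapped
        handle_request_error(exc.last_attempt.exception())
```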
I propose to always raise the underlying cause (e.g. RequestError).
Ways to implement it: for example, adding reraise=True to the tenacity retry configuration.
At least for now, I cannot think of a bad side effect of adding reraise=True. Let's ask @ejulio and @kmike for their opinions.
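For context, this is what tenacity's reraise flag does (a standalone sketch, unrelated to the actual client code):

```python
import tenacity

@tenacity.retry(stop=tenacity.stop_after_attempt(3), reraise=True)
def flaky_call():
    raise ValueError("simulated request failure")

try:
    flaky_call()
except ValueError:
    # with reraise=True the caller sees ValueError here;
    # without it, tenacity would raise tenacity.RetryError instead
    pass
```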
I'll just add that we should have a way to know whether a request was retried (we should be able to distinguish a non-retried error from a retried one).
To some extent, I prefer two exception types over a property on the class. One way to do it would be to have RequestError as the base and RequestRetriesExceededError(RequestError) as a subclass. Then, if the user doesn't care about the difference, they just catch RequestError; otherwise they have a way to catch each and handle them separately.
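A minimal sketch of that hierarchy (`send_request` is a hypothetical stand-in for the client call):

```python
class RequestError(Exception):
    """Base: the AutoExtract request failed."""

class RequestRetriesExceededError(RequestError):
    """The request still failed after exhausting all retries."""

def send_request(query):  # hypothetical stand-in for the client call
    raise RequestRetriesExceededError(f"gave up on {query!r} after retries")

try:
    send_request("https://example.com")
except RequestRetriesExceededError:
    print("retried and gave up")      # handle the retried case specifically
except RequestError:
    print("failed without retrying")  # everything else
```

Catching only RequestError covers both cases, so users who don't care about the distinction are unaffected.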
Is this fixed by https://github.com/scrapinghub/scrapinghub-autoextract/pull/27?
Thanks @kmike for pointing this out. This was fixed by #27. Closing.
IMO we should never raise tenacity.RetryError. Instead, we should raise the underlying RequestError for request errors, potentially a new exception for network errors (or maybe just reuse RequestError), or return request_processor.get_latest_results() otherwise. We could include the number of retries in RequestError in case we want to make this information accessible to the client (or even include the tenacity.RetryError inside it, as it can carry interesting information). More was discussed here: https://github.com/scrapinghub/scrapinghub-autoextract/pull/7#discussion_r454970102