zotero / zotero-connectors

Chrome, Firefox, Edge, and Safari extensions for Zotero
https://www.zotero.org/download/connectors
Other
513 stars 121 forks source link

Add automatic 429 retry to async request functions #403

Open dstillman opened 1 year ago

dstillman commented 1 year ago

https://forums.zotero.org/discussion/comment/419513/#Comment_419513

And then we'd want to add this to the specification for request() in utilities_translate.js in zotero/utilities so that other implementations (iOS, desktop, etc.) do the same.

As I note there, we could do something more sophisticated and do this per-domain across requests, but once translators start using the async functions, they'll probably be naturally serialized. (E.g., translators probably shouldn't use Promise.all() anyway.)

adomasven commented 1 year ago

I think we should just do this in utilities_translate.js, in the translate repo, in the Zotero.Utilities.Translate.request() function. That way other Zotero repos can just update the submodule and have the change.

Also HTTP.request() implements and uses a default 15000ms timeout that we don't allow translators to override (in Utilities.Translate.request()). Since this would be multiple requests, they would all get individual 15000ms timeout timers, which I am not sure if we actually want, so we need to consider whether we want to expose the timeout option to translators and how we apply to it in case of a 429 retry.

dstillman commented 1 year ago

I think we should just do this in utilities_translate.js

Yeah, that makes sense.

I don't think the 15s timeout matters to translators themselves. We're not giving translators control over 429 handling to begin with, so they don't know how long a "request" could take. Mostly we're just adding a pause and assuming that requests from translators are serialized so that we don't hammer servers.

One problem is that, to avoid sending two immediate successive requests to the same domain, we have to either track the domain across logical requests (i.e., Utilities.Translate.request() calls) or add in another delay if the retry also gets a 429, since we don't know if there will be another request to the same domain immediately after. But if there's not another request, it's annoying to add in another delay for no reason.

Relatedly, we can respect Retry-After if it's set, but 1) if it's too high, we'd probably want to just fail and 2) if there's another request for the same domain within the specified period, we'd ideally avoid making that request.

So maybe we do need to persist state here and keep track of pauseUntil per domain, with some garbage collection…

But then bringing timeouts and retries and pauses together, can we think of a non-gross way to set a maximum logical request time for a given Zotero.Translate run? E.g., the Connector probably shouldn't retry more than once (or perhaps at all if the first request took too long) and shouldn't pause for that long after a 429, but for something with a progress queue like PDF metadata retrieval or metadata updating, we could easily try a few times and pause for a minute or two for a given domain while processing others…