corradio / polynomial

A central place to track your most vital KPIs
https://polynomial.so

Gracefully handle 429 errors #58

Closed: corradio closed this issue 8 months ago

corradio commented 1 year ago

In tasks.py, every task should gracefully handle 429 errors, and save intermediary results.

requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url

corradio commented 1 year ago

For now, we make sure that any HTTP error triggers a retry of the task. Furthermore, since we now use generators, the measurements should be added (streamed) to the db as they are fetched. This means a partially completed task should have less work left when the retry happens, as the data fetched so far has already been inserted into the db.
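
A minimal sketch of the streaming idea, not the actual code in tasks.py. The names `collect`, `run_task`, and `TooManyRequests`, and the dict standing in for the db, are all hypothetical. It only shows that datapoints yielded before the failure are already stored when the error surfaces:

```python
from datetime import date, timedelta
from typing import Dict, Iterator, Optional, Tuple


class TooManyRequests(Exception):
    """Hypothetical stand-in for the HTTPError raised on a 429."""


def collect(
    start: date, end: date, fail_on: Optional[date] = None
) -> Iterator[Tuple[date, float]]:
    """Hypothetical integration: lazily yields one datapoint per day."""
    day = start
    while day <= end:
        if day == fail_on:
            # Simulate the API rate limiting us halfway through the range.
            raise TooManyRequests(f"429 while fetching {day}")
        yield day, 1.0  # placeholder measurement value
        day += timedelta(days=1)


def run_task(
    db: Dict[date, float], start: date, end: date, fail_on: Optional[date] = None
) -> None:
    """Insert each datapoint as soon as it is yielded (dict used as a fake db)."""
    for day, value in collect(start, end, fail_on):
        db[day] = value


db: Dict[date, float] = {}
try:
    run_task(db, date(2023, 1, 1), date(2023, 1, 5), fail_on=date(2023, 1, 3))
except TooManyRequests:
    pass  # in the real task, this is where the retry would be triggered

print(len(db))  # 2: the first two days were stored before the error
```

Had the task collected everything into a list before inserting, the db would still be empty at this point.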

corradio commented 1 year ago

Note to self: the complexity lies in the fact that integrations yield datapoints, not tasks. Therefore, a task can't be chunked up into smaller pieces that are retried individually (an integration has no notion of a task). The only thing we can do is retry the whole task, which might be fine as long as the code resumes from where it last worked.

corradio commented 1 year ago

Note that Plausible limits clients to 600 requests per hour, and the backfill task is currently set to retry 5 times, with an initial backoff of 10s (the backoff doubles 5 times: 32 x 10 = 320s ≈ 5min). The backfill task will therefore never wait as long as an hour before trying again.

9 retries (512 x 10 = 5120s ≈ 85min) should be ok.
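
The arithmetic above can be checked with a short sketch. It assumes the delay before retry n is `base * 2**(n-1)` with no jitter and no cap, and that the rate limit resets after a 3600s window; how the task queue actually schedules retries may differ, and the function names are hypothetical:

```python
from typing import List


def retry_delays(base_seconds: int, retries: int) -> List[int]:
    """Delay before each retry, doubling every time (assumed schedule)."""
    return [base_seconds * 2**attempt for attempt in range(retries)]


def total_wait(base_seconds: int, retries: int) -> int:
    """Cumulative time spent waiting across all retries."""
    return sum(retry_delays(base_seconds, retries))


def min_retries(base_seconds: int, window_seconds: int) -> int:
    """Smallest retry count whose cumulative wait covers the window."""
    retries = 0
    while total_wait(base_seconds, retries) < window_seconds:
        retries += 1
    return retries


print(total_wait(10, 5))      # 310 seconds, about 5 minutes
print(total_wait(10, 9))      # 5110 seconds, about 85 minutes
print(min_retries(10, 3600))  # 9
```

Under these assumptions the cumulative wait is `10 * (2**n - 1)` seconds, slightly below the rounded figures above, and 9 is the smallest retry count that spans a full hour.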

corradio commented 8 months ago

Note that for now, a backfill retry causes the same range to be retried, and thus we refetch data that was already fetched. A better strategy would be to remember how far the backfill task got and use that as the starting point for the retry.
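
A rough sketch of what resuming could look like, not an implementation of the actual backfill task. It assumes datapoints are stored in chronological order without gaps, so the latest stored date in the range marks how far the previous attempt got. `backfill`, `flaky_fetch`, and the dict used as a db are hypothetical:

```python
from datetime import date, timedelta
from typing import Callable, Dict, List


def backfill(
    db: Dict[date, float],
    fetch_day: Callable[[date], float],
    start: date,
    end: date,
) -> None:
    """Fetch [start, end], skipping the prefix that is already stored."""
    stored = [d for d in db if start <= d <= end]
    # Resume right after the latest stored day, or from the start if none.
    day = max(stored) + timedelta(days=1) if stored else start
    while day <= end:
        db[day] = fetch_day(day)
        day += timedelta(days=1)


calls: List[date] = []


def flaky_fetch(day: date) -> float:
    """Hypothetical fetcher that fails once on 2023-01-03."""
    calls.append(day)
    if day == date(2023, 1, 3) and calls.count(day) == 1:
        raise RuntimeError("429 Too Many Requests")
    return 1.0  # placeholder measurement value


db: Dict[date, float] = {}
try:
    backfill(db, flaky_fetch, date(2023, 1, 1), date(2023, 1, 5))
except RuntimeError:
    pass  # the task queue would schedule the retry here

# The retry is called with the same range but only fetches what is missing.
backfill(db, flaky_fetch, date(2023, 1, 1), date(2023, 1, 5))
print(len(calls))  # 6 fetches in total, instead of 8 when restarting from scratch
```

If the data can have gaps, an explicit checkpoint stored alongside the task would be a safer marker than the latest stored date.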