simonw / airtable-export

Export Airtable data to YAML, JSON or SQLite files on disk
https://datasette.io/tools/airtable-export
Apache License 2.0

Error handling / retry #7

Open obra opened 3 years ago

obra commented 3 years ago

Per the Airtable API docs:

RATE LIMITS
The API is limited to 5 requests per second per base. If you exceed this rate, you will receive a 429 status code and will need to wait 30 seconds before subsequent requests will succeed.

The official JavaScript client has built-in retry logic.

If you anticipate a higher read volume, we recommend using a caching proxy. This rate limit is the same for all plans and increased limits are not currently available.

Possibly related, but possibly unrelated: I've been seeing occasional "Error: The read operation timed out" failures in my GitHub Actions Airtable backup jobs.

I don't know if there is already internal retry logic and we're just blowing past the retry count or if we're running into something not currently contemplated by the code.

simonw commented 3 years ago

The tool currently sleeps for 0.2 seconds between requests to try to stay within the limit of 5 requests per second per base, but it doesn't do anything smarter than that:

https://github.com/simonw/airtable-export/blob/6e27758ef67ee69b452edd706638216436b9f3a8/airtable_export/cli.py#L74-L90

Ideally it would catch those 429 status codes, delay and then retry - potentially with exponential backoff of some kind.
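
A rough sketch of what that could look like, not the tool's current behaviour. It assumes the HTTP call is wrapped in a hypothetical zero-argument `fetch` callable that returns an object with a `status_code` attribute; `fetch_with_retry`, `max_attempts` and `base_delay` are made-up names, and the 30 second starting delay is taken from the rate limit text quoted above:

```python
import time


def fetch_with_retry(fetch, max_attempts=5, base_delay=30.0, sleep=time.sleep):
    """Call fetch() until it returns something other than a 429.

    fetch is a hypothetical zero-argument callable returning an object
    with a .status_code attribute (e.g. an HTTP client response).
    sleep is injectable so the backoff can be tested without waiting.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        response = fetch()
        if response.status_code != 429:
            return response
        if attempt == max_attempts:
            break  # out of attempts, do not sleep again
        sleep(delay)
        delay *= 2  # exponential backoff: 30s, 60s, 120s, ...
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")
```

The same wrapper could probably be extended to catch read timeouts as well, but that would depend on which exception the HTTP client raises.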

simonw commented 3 years ago

Also worth noting:

Iteration may timeout due to client inactivity or server restarts. In that case, the client will receive a 422 response with error message LIST_RECORDS_ITERATOR_NOT_AVAILABLE. It may then restart iteration from the beginning.
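
Handling that case might look something like the sketch below. It is only an illustration under assumptions: `fetch_page` is a hypothetical callable that takes a pagination offset (`None` for the first page) and returns `(records, next_offset)`, and `IteratorExpired` is a made-up exception that the caller would raise on seeing the 422 with LIST_RECORDS_ITERATOR_NOT_AVAILABLE. Neither exists in airtable-export.

```python
class IteratorExpired(Exception):
    """Hypothetical: raised for a 422 LIST_RECORDS_ITERATOR_NOT_AVAILABLE."""


def fetch_all_records(fetch_page, max_restarts=3):
    """Collect every record, starting over if the iterator expires.

    fetch_page is a hypothetical callable: it takes an offset (None for
    the first page) and returns (records, next_offset), where
    next_offset is None once the last page has been returned.
    """
    for _ in range(max_restarts + 1):
        records = []
        offset = None
        try:
            while True:
                page, offset = fetch_page(offset)
                records.extend(page)
                if offset is None:
                    return records
        except IteratorExpired:
            # Partial results are discarded; iteration restarts from
            # the beginning on the next pass through the outer loop.
            continue
    raise RuntimeError(f"Iterator expired {max_restarts + 1} times, giving up")
```

Restarting throws away everything fetched so far, so for large tables it may be worth keeping `max_restarts` small and failing loudly.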