gtalarico / pyairtable

Python API client for Airtable
https://pyairtable.readthedocs.io
MIT License

Use a default retry strategy and remove `time.sleep` #272

Closed · mesozoic closed this 1 year ago

mesozoic commented 1 year ago

This implements a default retry strategy that retries 429 (Too Many Requests) responses up to five times before giving up. That seems like a sensible default, and it is similar to the behavior of the official Airtable.js library. The brief testing I've done suggests that removing time.sleep and replacing it with retries speeds up large batch operations.
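
For anyone unfamiliar with the mechanism: the general shape is a urllib3 Retry object mounted on a requests session. A minimal sketch of the idea (not the exact code in this PR):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

# Sketch only: retry 429 responses up to five times, with exponential
# backoff between attempts. Retry also honors the Retry-After header
# on 429s by default.
retry = Retry(
    total=5,
    status_forcelist=[429],
    allowed_methods=None,  # None means retry every HTTP method, including POST
    backoff_factor=0.1,
)
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))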

Given this code...

import os
import time
from threading import Thread
from pyairtable import Api

def main():
    # Each case is (number of threads, records created per thread).
    for case in [(1, 500), (5, 100), (25, 20)]:
        timer(*case)

def timer(threads, records):
    start_time = time.time()
    started = []
    for thread_number in range(threads):
        t = Thread(target=bulk_create, args=[thread_number, records])
        t.start()
        started.append(t)

    # Wait for every thread to finish before stopping the timer.
    for t in started:
        t.join()
    duration = time.time() - start_time
    print(threads, "thread(s),", records, "records:", f"{duration:.2f}s")

def bulk_create(prefix, count):
    api = Api(os.environ["AIRTABLE_API_KEY"])
    table = api.table("appaPqizdsNHDvlEm", "TEST_TABLE")
    # Create `count` records, then delete them to clean up after the run.
    records = [{"text": f"bulk {prefix} #{n}"} for n in range(1, count + 1)]
    created = table.batch_create(records)
    table.batch_delete([record["id"] for record in created])

if __name__ == "__main__":
    main()

...I got this output:

% git checkout main
% python3 tmp/bulk.py
1 thread(s), 500 records: 53.67s
5 thread(s), 100 records: 12.96s
Exception in thread Thread-14:
...
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests

% git checkout default_retry
% python3 tmp/bulk.py
1 thread(s), 500 records: 32.18s
5 thread(s), 100 records: 10.19s
25 thread(s), 20 records: 10.02s

I'd considered retrying 500s by default, but that is probably inadvisable, given that a 500 might indicate a request was partially fulfilled. It should be left to the implementer to decide whether or not to retry an operation in that situation.
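
An implementer who knows their writes are safe to repeat could opt in by widening the status list in the same kind of Retry sketch shown earlier:

from urllib3.util import Retry

# Opt-in sketch: also retry transient server errors. Only sensible for
# idempotent operations, since a 500 may mean the request was partially
# fulfilled before it failed.
retry_server_errors = Retry(
    total=3,
    status_forcelist=[429, 500, 502, 503],
    allowed_methods=None,
    backoff_factor=0.5,
)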

This does not add any new tests. I considered adding something like the above as an integration test, but it would add several seconds to each test run and I'm not clear on what sort of regressions we'd actually be guarding against.

This branch closes #218

mesozoic commented 1 year ago

It turns out this code only works with urllib3 >= 1.26, which changed the signature of the Retry class, while the minimum version we test against is 1.25.11. Given that 1.26.0 was released in 2020, I'm going to suggest we bump this package's urllib3 dependency rather than try to remain backwards-compatible.
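
For reference, the signature change in question is presumably the 1.26 rename of method_whitelist to allowed_methods, which makes the newer spelling a TypeError on older releases:

from urllib3.util import Retry

# urllib3 >= 1.26 accepts the new keyword:
Retry(total=5, status_forcelist=[429], allowed_methods=None)

# urllib3 < 1.26 only understood the old spelling, so the call above
# raises TypeError there; the pre-1.26 equivalent was:
# Retry(total=5, status_forcelist=[429], method_whitelist=None)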

codecov[bot] commented 1 year ago

Codecov Report

Merging #272 (9f5edbc) into main (9632b6e) will decrease coverage by 0.18%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #272      +/-   ##
==========================================
- Coverage   94.64%   94.46%   -0.18%     
==========================================
  Files          15       15              
  Lines         765      759       -6     
==========================================
- Hits          724      717       -7     
- Misses         41       42       +1     
Impacted Files               Coverage Δ
pyairtable/api/table.py      98.00% <ø> (-0.10%) ↓
pyairtable/api/api.py        98.50% <100.00%> (-1.50%) ↓
pyairtable/api/retrying.py  100.00% <100.00%> (ø)

mesozoic commented 1 year ago

I added an integration test that does (more or less) what the script above does: it spawns 25 threads and makes multiple calls to batch_create from each of them. Without the default retry strategy, the test fails. With it, the test passes in ~14 seconds. Better to be confident we won't regress this in the future.
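
A sketch of roughly what such a test could look like (the base ID, table name, and batch sizes here are placeholders, not the PR's actual test):

import os
from concurrent.futures import ThreadPoolExecutor
from pyairtable import Api

def test_concurrent_batch_create():
    api = Api(os.environ["AIRTABLE_API_KEY"])
    table = api.table("appXXXXXXXXXXXXXX", "TEST_TABLE")  # placeholder base/table

    def worker(prefix):
        # Several batch_create calls per thread, enough to trip the
        # rate limit unless the default retry strategy kicks in.
        for batch in range(3):
            records = [{"text": f"test {prefix}.{batch} #{n}"} for n in range(10)]
            created = table.batch_create(records)
            table.batch_delete([record["id"] for record in created])

    with ThreadPoolExecutor(max_workers=25) as pool:
        list(pool.map(worker, range(25)))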