Monitor request rate (rate limiting)

judgej commented 5 years ago

The Xero API applies a very strict rate limit to each application for each organisation. The basic limit is a maximum of 60 requests in a rolling 60-second window. There is a daily limit too, but the per-minute limit is the one that is hit most often. It is easy to hit when pulling payment details for a large payment batch, which can include up to 200 payments in one go.

The PSR-18 client is the ideal place to monitor the requests on a rolling basis. It will then be able to signal to the application if the rate is nearing the application/organisation limit.

As for how the approach to the limit could be signaled, one way would be to provide the time to wait before the next (or next N) requests can be sent. Far from the limit, the time would be zero; approaching the limit, it would go up. The application can then decide whether to sleep for a couple of seconds or reschedule a job to finish off processing after some calculated delay. The aim is to make the most efficient use of the permitted rate, while trying hard not to exceed the limits and break jobs unexpectedly.

A persistent cache would be needed to monitor the rolling-minute requests, since each new or rescheduled job would need to know how many requests it can issue in its time slot before it must pull back and reschedule or sleep.
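To make the "time to wait" idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the PHP client). The class name `RollingWindowMonitor` and its methods are hypothetical, and the in-memory deque stands in for the persistent cache described above.

```python
import time
from collections import deque


class RollingWindowMonitor:
    """Hypothetical helper: tracks request times in a rolling window."""

    def __init__(self, limit=60, window=60.0, clock=time.time):
        self.limit = limit      # max requests per window
        self.window = window    # window length in seconds
        self.clock = clock      # injectable so the logic can be tested
        self.sent = deque()     # stand-in for a persistent cache

    def _expire(self, now):
        # Drop timestamps that have rolled out of the window.
        while self.sent and self.sent[0] <= now - self.window:
            self.sent.popleft()

    def record(self):
        """Call once for each request actually sent."""
        now = self.clock()
        self._expire(now)
        self.sent.append(now)

    def wait_time(self, count=1):
        """Seconds to wait before `count` more requests fit in the window."""
        if count > self.limit:
            raise ValueError("count can never fit inside one window")
        now = self.clock()
        self._expire(now)
        overflow = len(self.sent) + count - self.limit
        if overflow <= 0:
            return 0.0  # far from the limit: no wait needed
        # The overflow-th oldest request must leave the window first.
        return max(0.0, self.sent[overflow - 1] + self.window - now)
```

A job could call `wait_time(n)` before a batch of `n` requests and either sleep for the result or reschedule itself by that many seconds.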

Add rolling minute limits

Per Xero organisation

[ ] General API Minute Limit: Max 60 calls in a rolling 60 second window (per Xero org)
[ ] General API Daily Limit: Max 5000 calls in a rolling 24 hour window (per Xero org)

Across All of CreDec

[ ] Bank statement lines: Max 180 in a rolling 60 second window across the whole CreDec app (confirmed by James Coleman on CSI first demo, 26 Sept 2019)

judgej commented 5 years ago

This could be done by a decorator, so it need not be a core part of this package. The decorator would take a caching service for storing its request history, and could return results in the response headers, as the GitHub API does with its custom X-RateLimit-* headers.
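A rough sketch of that decorator shape, written in Python purely for illustration. Everything here is an assumption: `Response`, `RateMonitorDecorator` and `send_request` are hypothetical names, a plain dict stands in for the caching service, and the header names simply follow the GitHub-style X-RateLimit-* pattern mentioned above.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Response:
    """Hypothetical minimal response object."""
    status: int
    headers: dict = field(default_factory=dict)


class RateMonitorDecorator:
    """Wraps any client exposing send_request() and annotates responses."""

    def __init__(self, inner, cache, limit=60, window=60.0, clock=time.time):
        self.inner = inner
        self.cache = cache      # dict used as a stand-in for a real cache
        self.limit = limit
        self.window = window
        self.clock = clock

    def send_request(self, request):
        now = self.clock()
        # Keep only the requests still inside the rolling window.
        history = [t for t in self.cache.get("history", [])
                   if t > now - self.window]
        history.append(now)
        self.cache["history"] = history

        response = self.inner.send_request(request)
        remaining = max(0, self.limit - len(history))
        response.headers["X-RateLimit-Limit"] = str(self.limit)
        response.headers["X-RateLimit-Remaining"] = str(remaining)
        # Assumption: "reset" means when the oldest request leaves the window.
        response.headers["X-RateLimit-Reset"] = str(int(history[0] + self.window))
        return response
```

The wrapped client is untouched, so the monitoring can be added or removed without changing the core package.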

judgej commented 5 years ago

An approach to handling the sliding window rate limiter is described and shown graphically in this excellent article:

https://medium.com/figma-design/an-alternative-approach-to-rate-limiting-f8a06cf7c94c
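As I read it, the core of that article is a sliding window counter: keep one counter per fixed window and weight the previous window's count by how much of it still overlaps the rolling window. The sketch below is my own Python illustration of that idea, not code from the article; `SlidingWindowCounter` and its methods are hypothetical names. It is an approximation, so it can over- or under-count compared with tracking every timestamp.

```python
class SlidingWindowCounter:
    """Approximate rolling-window limiter using two fixed-window counters."""

    def __init__(self, limit=60, window=60.0):
        self.limit = limit
        self.window = window
        self.counts = {}  # fixed-window index -> requests seen in that window

    def estimate(self, now):
        """Estimated number of requests in the rolling window ending at `now`."""
        index = int(now // self.window)
        elapsed = now - index * self.window       # time into the current window
        previous = self.counts.get(index - 1, 0)
        current = self.counts.get(index, 0)
        # Weight the previous window by the fraction that still overlaps.
        overlap = (self.window - elapsed) / self.window
        return previous * overlap + current

    def allow(self, now):
        """Record a request and return True if the estimate is under the limit."""
        if self.estimate(now) >= self.limit:
            return False
        index = int(now // self.window)
        self.counts[index] = self.counts.get(index, 0) + 1
        return True
```

The attraction is storage: two integers per organisation rather than a list of up to 60 timestamps.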

bradydan commented 5 years ago

Xero API docs:

API Rate Limits

There are limits to the number of API calls that your application can make against a particular Xero organisation.
Minute Limit: 60 calls in a rolling 60 second window
Daily Limit: 5000 calls in a rolling 24 hour window
If you exceed either rate limit you will receive a HTTP 503 (Service Unavailable) response with the following message in the http response body:
oauth_problem=rate limit exceeded&oauth_problem_advice=please wait before retrying the xero api
You will also receive an X-Rate-Limit-Problem HTTP header with the value of Daily or Minute to indicate which rate limit you have hit.

If you encounter a limit, do not continue to make requests as this may continue to add to your limitation. The most common issue encountered is the 60 requests/min rate limit. If possible, queue requests and allow a few seconds before attempting to make another request.
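For illustration, detecting which limit was hit from the documented 503 response and `X-Rate-Limit-Problem` header might look like the Python sketch below. The function names and the delay values are assumptions of mine, not anything Xero specifies; the docs only say to wait before retrying.

```python
def rate_limit_problem(status, headers):
    """Return 'minute', 'daily', or None if this is not a rate-limit response."""
    if status != 503:
        return None
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-rate-limit-problem":
            return value.strip().lower()
    return None  # a 503 without the header: some other kind of outage


def suggested_delay(problem, minute_wait=60.0, daily_wait=3600.0):
    """Hypothetical back-off policy, in seconds, for each kind of limit."""
    if problem == "minute":
        return minute_wait   # worst case: the whole window must roll over
    if problem == "daily":
        return daily_wait    # assumption: reschedule the job much later
    return 0.0
```

A job handler could use the returned delay to redispatch itself instead of retrying immediately and adding to the count.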

Some things to take into account:

The limits are rolling windows
The limits are per auth, i.e. per Xero organisation (not per CCODE)
Limits will vary for different cloud accounting providers. E.g. QuickBooks: "API calls are throttled at 500 requests per minute per realmId, including batch requests." (source) So this limit seems much higher than Xero's. P.S. I assume "realmId" is the equivalent of a Xero organisation ID.

judgej commented 4 years ago

Confirmed today from Xero that the rate is counted as a true rolling minute.

bradydan commented 4 years ago

Updated task to confirm Bank statement lines: Max 180 in a rolling 60 second window across the whole CreDec app

bradydan commented 4 years ago

There are some pre-existing libraries for rate limiting:

https://github.com/davedevelopment/stiphle

https://github.com/touhonoob/RateLimit

https://github.com/eddiejibson/limitrr-php (See comments on reddit for this one: https://reddit.com/r/PHP/comments/a8ds5x/limitrr_better_php_rate_limiting_library_version/)

judgej commented 4 years ago

All great and useful for their use-cases, and I have pulled out a few useful bits. I ended up writing my own https://github.com/consilience/api-rate-monitor

What we have isn't strictly rate limiting for the transactions, but rate monitoring with the ability to redispatch the job after a period until we can send another bunch of requests. The main features that distinguish our solution are:

Wraps around a PSR-18 client.
Uses any PSR-6 cache, so easily plugs into any framework or simple script.
Implements rolling window rate limit. Could be expanded to support leaky bucket or other rules if needed.
Does not prevent any requests or sleep by itself. It just warns the app when it is getting close to the limit, and can give the app the sleep time needed to refresh the rolling window completely, or how many requests can be performed in the next N seconds before the limit is reached.
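The two figures in that last point can be derived from the cached request history alone. This is a Python sketch of the arithmetic, not the package's actual API; the function names are hypothetical, and `requests_available` assumes `within` is no longer than the window.

```python
def requests_available(history, now, within=0.0, limit=60, window=60.0):
    """How many requests can be sent between `now` and `now + within`.

    `history` is a list of send times for past requests. Only those still
    inside the window at `now + within` count against the limit.
    Assumes 0 <= within <= window.
    """
    cutoff = now + within - window
    still_counted = sum(1 for t in history if t > cutoff)
    return max(0, limit - still_counted)


def full_refresh_time(history, now, window=60.0):
    """Seconds until every recorded request has rolled out of the window."""
    if not history:
        return 0.0
    return max(0.0, max(history) + window - now)
```

With these two numbers the app can choose between a short sleep, a smaller batch, or redispatching the job.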

Most of the solutions seem to be based around a process having a bunch of API requests to send one after the other in a simple loop. This will be the case when working through a queue of bank statements, so we can use simple throttling there. However, most of the processes are not simple loops - they are fetching pages, then individual records, then details, then some contacts, then some details here, optional details there etc. They are difficult to track as a simple loop, so this gives us more flexibility to define the throttling rules to make best use of the permitted rate.

bradydan commented 4 years ago

A sublimely elegant approach :)

Glad that there was even something marginally useful in the other packages

judgej commented 4 years ago

I am wondering now if this approach as a PSR-18 client wrapper actually could sleep and slow down the requests if needed. All the requests go through it, and it knows exactly how close it is to blowing the limit at any point, and could just sleep for a second before each request if it is, say, within 10 requests of its limit for the rolling window. It might work as an option. A sleep and a shower thought session in the morning may shed some light on that approach.
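A quick sketch of what that option might look like, in Python for brevity. `ThrottlingClient` and its parameters are hypothetical. Note that a fixed one-second pause only slows the approach to the limit; it does not guarantee staying under it, which would need a wait until the oldest request has left the window.

```python
import time


class ThrottlingClient:
    """Wraps a client and pauses before requests when near the limit."""

    def __init__(self, inner, limit=60, window=60.0, margin=10, pause=1.0,
                 clock=time.time, sleep=time.sleep):
        self.inner = inner
        self.limit = limit
        self.window = window
        self.margin = margin    # start pausing this many requests early
        self.pause = pause      # seconds to sleep before each such request
        self.clock = clock
        self.sleep = sleep
        self.history = []       # in-memory stand-in for a persistent cache

    def send_request(self, request):
        now = self.clock()
        # Forget requests that have rolled out of the window.
        self.history = [t for t in self.history if t > now - self.window]
        if len(self.history) >= self.limit - self.margin:
            self.sleep(self.pause)  # slow down, but still send
            now = self.clock()
        self.history.append(now)
        return self.inner.send_request(request)
```

Making `margin` and `pause` constructor options keeps the sleeping behaviour opt-in, so the wrapper can still be used as a pure monitor.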

consilience / xero-api-client

Monitor request rate (rate limiting) #11
