deboorn / expbackoffworker

Adds automatic exponential backoff to Laravel 5's queue worker
Apache License 2.0
7 stars · 3 forks

Spatie Backup & Exponential Backup #6

Open jasperf opened 4 years ago

jasperf commented 4 years ago

Detailed description

We are having issues with Spatie's Laravel Backup due to rate limiting at Digital Ocean Spaces: we get timeouts / 503s. Beyond the better error reporting that the PHP League Flysystem S3 driver (used by Laravel Backup) offers, there is no exponential backoff built in. We discussed this with the Flysystem maintainers (https://github.com/thephpleague/flysystem-aws-s3-v3/issues/205) and with Spatie (https://github.com/spatie/laravel-backup/issues/783). Both agree exponential backoff is a good idea, but neither package provides it.

So it would be great to show, in a test case or perhaps an example in the README.md, how this package could be combined with Laravel Backup. That would help us a lot in our quest to get exponential backoff working.

Context

This change is vital for us to get Digital Ocean Spaces working for backups, and potentially for serving static data. Currently backups only partly work, and we had to give up on using their CDN for loading images. Exponential backoff would allow us to retry backups and to clean up old backups after a given number have been stored.

Possible implementation

The only implementations I can think of currently are either an example of the integration, shown in a test or in the README.md here, or a new package that combines Laravel Backup with this exponential backoff worker. I think the former is better, as neither PHP Flysystem S3 nor Spatie's Laravel Backup wants to implement this directly; both want to focus on the core features of their respective packages.
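For a README example, the retry-with-backoff pattern this package applies to queue jobs could be sketched in a few lines. The sketch below is language-neutral Python rather than the package's PHP, and all helper names in it are hypothetical:

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    # Full-jitter exponential backoff: the nominal delay doubles each
    # attempt (base * 2^attempt, capped), and a random factor spreads
    # retries out so many workers don't hammer the API in lockstep.
    return random.uniform(0, min(cap, base * 2 ** attempt))

def retry_with_backoff(operation, max_attempts=5, base=1.0,
                       should_retry=lambda exc: True):
    # Run `operation`, sleeping with exponential backoff between failed
    # attempts; re-raise once attempts are exhausted or the error is
    # deemed non-retryable (e.g. anything other than a 503).
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            if attempt == max_attempts - 1 or not should_retry(exc):
                raise
            time.sleep(backoff_delay(attempt, base=base))
```

In a Laravel context the same idea would live in the job's retry handling rather than a wrapper function, but the delay schedule is the part that matters.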


deboorn commented 4 years ago

Thanks for reaching out. I hope your week is going well! It looks like Digital Ocean's API returns rate-limit information in the response headers, including "RateLimit-Remaining" (the number of requests that remain) and "RateLimit-Reset" (the time when the oldest request will expire). In this case, you would want to monitor the response headers of their API and dynamically adjust your backups to honor the rate-limit reset.
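Reading those headers before deciding whether to send the next request might look like this. The header names are the ones from Digital Ocean's docs; everything else is an illustrative Python sketch, not part of this package:

```python
import time

def seconds_until_reset(headers, now=None):
    # If requests remain in the quota, no wait is needed. Otherwise wait
    # until the oldest request expires (RateLimit-Reset is Unix epoch
    # seconds), never returning a negative delay.
    now = time.time() if now is None else now
    remaining = int(headers.get("RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset = float(headers.get("RateLimit-Reset", now))
    return max(0.0, reset - now)
```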

It sounds like you may want to both throttle and schedule your requests to Digital Ocean. An exponential backoff would be handy if you match the minimum backoff delay to the minimum reset time for the API. However, that assumes your requests to their API are uniformly timed.

With all that being said, an exponential backoff alone wouldn't solve the issue. You would want a task queue system that supports scheduled (delayed) tasks, throttling, and a backoff feature.
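Combining the two ideas, a delayed-task scheduler could compute the next retry time as the larger of the backoff schedule and the server's own reset hint. A minimal sketch, assuming Digital Ocean's header names and with all function names hypothetical:

```python
def next_retry_delay(attempt, headers, now, base=1.0, cap=300.0):
    # Nominal exponential backoff for this attempt, capped.
    backoff = min(cap, base * 2 ** attempt)
    # If the server says the quota is exhausted, also honor its reset
    # time (RateLimit-Reset is Unix epoch seconds).
    remaining = int(headers.get("RateLimit-Remaining", 1))
    reset_wait = 0.0
    if remaining == 0:
        reset_wait = max(0.0, float(headers.get("RateLimit-Reset", now)) - now)
    # Wait as long as the stricter of the two demands.
    return max(backoff, reset_wait)
```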

Fortunately, Google Cloud Tasks offers exactly that with its push queues, which operate over HTTP. You can learn more at: https://cloud.google.com/tasks

jasperf commented 4 years ago

@deboorn Thank you so much for your detailed response. I had not looked into RateLimit-Remaining and RateLimit-Reset that much; I was more focused on the 200-requests-per-second limit. But I guess that would not be enough, as https://developers.digitalocean.com/documentation/v2/#rate-limit shows:

General limits in API:

Requests through the API are rate limited per OAuth token. Current rate limits:

- 5,000 requests per hour
- 250 requests per minute (5% of the hourly total)

So that is about 83 per minute if spread evenly. And 250 per minute is again far less than 200 requests per second:
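Just to make the units comparable (the figures are the ones quoted in this thread):

```python
hourly_limit = 5000            # API: 5,000 requests per hour
even_per_minute = hourly_limit / 60   # ~83 if spread evenly
burst_per_minute = 250         # API: per-minute cap (5% of the hourly total)
spaces_per_second = 200        # Spaces: per-Space throttle behind the 503s
spaces_per_minute = spaces_per_second * 60   # 12,000/min, far above the API cap
```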

Comment beginning 2018 on Spaces:

at the moment we are rate limiting individual Spaces (the 503 error) that are receiving more than 200 reqs/s. If you need higher throughput we would ask that you create multiple Spaces and split your objects among them.

https://www.digitalocean.com/community/questions/rate-limiting-on-spaces

But these could be related / connected.

Specifics:

- RateLimit-Limit: The number of requests that can be made per hour.
- RateLimit-Remaining: The number of requests that remain before you hit your request limit. See the information below for how the request limits expire.
- RateLimit-Reset: This represents the time when the oldest request will expire. The value is given in Unix epoch time.

As long as the RateLimit-Remaining count is above zero, you will be able to make additional requests.

The way that a request expires and is removed from the current limit count is important to understand. Rather than counting all of the requests for an hour and resetting the RateLimit-Remaining value at the end of the hour, each request instead has its own timer.

.....

Keep this in mind if you see your RateLimit-Reset value change, but not move an entire hour into the future.

If the RateLimit-Remaining reaches zero, subsequent requests will receive a 429 error code until the request reset has been reached. You can see the format of the response in the examples.

Note: The following endpoints have special rate limit requirements that are independent of the limits defined above.

- Only 12 POST requests to the /v2/floating_ips endpoint to create Floating IPs can be made per 60 seconds.
- Only 10 GET requests to the /v2/account/keys endpoint to list SSH keys can be made per 60 seconds.
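The per-request timers described in those docs behave like a sliding window rather than a fixed hourly reset. A client-side limiter that mirrors that behavior might look like this (illustrative Python, not part of this package or Digital Ocean's API):

```python
from collections import deque
import time

class RollingWindowLimiter:
    # Each sent request expires `window` seconds after it was made, so
    # the quota refills gradually instead of all at once each hour.
    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.timestamps = deque()

    def allow(self, now=None):
        # Drop requests whose individual timers have expired, then admit
        # the new request only if the window still has room.
        now = time.time() if now is None else now
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Sending requests only when `allow()` returns true keeps a client under the limit without ever seeing a 429.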

Anyway, I am going to look into Google Cloud Tasks and the Digital Ocean API some more. Thanks again!