Add rate limiting for API endpoints

aron commented 9 months ago

This adds some basic rate limiting in production for the /api/predictions/[id] endpoint. We allow a sliding window of 20 requests within 10 seconds to fetch prediction metadata.
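For reviewers, here is a minimal sketch of what a sliding window of 20 requests per 10 seconds could look like. It is an in-memory illustration only, not necessarily how this PR implements it: the `SlidingWindowLimiter` class, its `allow` method, and the per-client `key` are hypothetical names.

```typescript
// Hypothetical in-memory sliding-window limiter (illustration only).
class SlidingWindowLimiter {
  // Timestamps (ms) of recently allowed requests, per client key.
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 20, windowMs: number = 10_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request fits in the window, false if it should be rejected.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the requests that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

A single-process map like this would not be shared across serverless instances, so a production setup would likely need a shared store.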

If the rate limit is hit, the client waits 10 seconds for the window to clear before retrying the request. This may make queries feel slower, but we need to keep request volume reasonable until we can move to a client-side implementation.
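The wait-and-retry behaviour can be sketched roughly as below. This assumes the endpoint signals a rate limit with HTTP 429, which the description does not state; `fetchWithRateLimitRetry`, `FetchLike`, and the `maxRetries` cap are hypothetical and only there to keep the example bounded and testable.

```typescript
// Minimal shape of a fetch-like function, so the sketch can be exercised without a network.
type FetchLike = (url: string) => Promise<{ status: number; json(): Promise<unknown> }>;

// Hypothetical helper: retry a request after a fixed wait whenever it is rate limited.
async function fetchWithRateLimitRetry(
  url: string,
  fetchFn: FetchLike,
  waitMs: number = 10_000, // the 10 second wait from the description
  maxRetries: number = 3, // assumption: give up eventually instead of retrying forever
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<unknown> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetchFn(url);
    // Assumption: 429 Too Many Requests marks a rate-limited response.
    if (res.status !== 429) return res.json();
    if (attempt >= maxRetries) {
      throw new Error(`Still rate limited after ${maxRetries} retries`);
    }
    await sleep(waitMs);
  }
}
```

In a browser this could be called as `fetchWithRateLimitRetry("/api/predictions/" + id, fetch)`, where `id` is whatever prediction identifier the caller holds.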

We also attempt to reduce load by introducing a backoff for the pings. Instead of pinging every 500ms indefinitely, we now slowly increase the time between requests. Looking at the distribution of traffic, most requests take between 1 and 3 seconds, so this shouldn't have a noticeable effect. But for longer predictions (>10 seconds) we'll greatly reduce the number of requests made.
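To illustrate the effect of the backoff, here is a rough sketch of a growing poll interval. Only the 500ms starting interval comes from the description; the growth factor of 1.5, the 5 second cap, and the helper names `pollDelayMs` and `countPolls` are assumptions chosen for the example, not the values used in this PR.

```typescript
// Delay before the next poll: starts at baseMs and grows geometrically up to maxMs.
// factor and maxMs are assumed values for illustration.
function pollDelayMs(
  attempt: number,
  baseMs: number = 500,
  factor: number = 1.5,
  maxMs: number = 5_000,
): number {
  return Math.min(maxMs, Math.round(baseMs * Math.pow(factor, attempt)));
}

// Counts how many polls are needed before a prediction of the given duration is seen as done.
function countPolls(durationMs: number, delayFor: (attempt: number) => number): number {
  let elapsed = 0;
  let polls = 0;
  while (elapsed < durationMs) {
    elapsed += delayFor(polls);
    polls++;
  }
  return polls;
}

const fixedPolls = countPolls(30_000, () => 500);
const backoffPolls = countPolls(30_000, (attempt) => pollDelayMs(attempt));
console.log({ fixedPolls, backoffPolls });
```

With these assumed values, a 30 second prediction takes 10 polls instead of 60, while a 2 second prediction takes 3 polls instead of 4, so short predictions are barely affected.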

vercel[bot] commented 9 months ago

The latest updates on your projects.

Name	Status	Preview	Comments	Updated (UTC)
zoo	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Dec 8, 2023 1:20pm

cbh123 commented 9 months ago

Gave it a quick try in the deployment preview and it looks good! Thank you!

replicate / zoo

Add rate limiting for API endpoints #65