replicate / zoo

🦓 Zoo — Image Playground
https://zoo.replicate.dev
Apache License 2.0

Add rate limiting for API endpoints #65

Closed aron closed 9 months ago

aron commented 9 months ago

This adds some basic rate limiting in production for the /api/predictions/[id] endpoint. We allow a sliding window of 20 requests within 10 seconds to fetch prediction metadata.
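
As a rough illustration of the sliding-window idea, here is a minimal in-memory sketch. It is not the code from this PR: the `SlidingWindowLimiter` class, its `allow` method, and keying by a client string are all hypothetical, and a production deployment would more likely keep the counters in a shared store than in process memory.

```typescript
// Hypothetical sliding-window limiter; names are illustrative, not from the PR.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  // Defaults mirror the numbers in the description: 20 requests per 10 seconds.
  constructor(limit: number = 20, windowMs: number = 10_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` is allowed at time `now` (ms).
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    // Only accepted requests count against the limit.
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

A route handler would call `allow(clientId)` and reject the request when it returns `false`. The status code used for rejection is not stated in this description.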

If the rate limit is hit, the client waits 10 seconds for the window to clear before retrying the request. This may make queries feel slower, but we need to keep request volume reasonable until we can move to a client-side implementation.
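
The client-side wait could look roughly like the sketch below. It makes several assumptions that this description does not confirm: that the server signals a rate limit with HTTP 429, that retries are capped (`maxRetries` is made up), and that the request and the wait are passed in as functions. `fetchWithRateLimitWait` and `StatusResponse` are hypothetical names.

```typescript
// Assumption: a rate-limited response has status 429; the PR does not say which code is used.
const RATE_LIMIT_WAIT_MS = 10_000;

// Minimal shape needed from a response; a real fetch Response also satisfies it.
type StatusResponse = { status: number };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `request`, and while it comes back rate limited, waits 10 seconds and tries again.
// `maxRetries` is an illustrative cap so the loop cannot run forever.
async function fetchWithRateLimitWait<R extends StatusResponse>(
  request: () => Promise<R>,
  wait: (ms: number) => Promise<void> = sleep,
  maxRetries: number = 3,
): Promise<R> {
  let response = await request();
  for (let attempt = 0; response.status === 429 && attempt < maxRetries; attempt++) {
    await wait(RATE_LIMIT_WAIT_MS);
    response = await request();
  }
  return response;
}
```

In the app this would wrap the call that fetches prediction metadata, for example `fetchWithRateLimitWait(() => fetch(url))`, where `url` points at the predictions endpoint.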

We also attempt to reduce the load by introducing a backoff for the pings. Instead of pinging every 500ms indefinitely, we now slowly increase the time between requests. Looking at the distribution of traffic, most requests take between 1 and 3 seconds, so this shouldn't have a noticeable effect. But for longer predictions (>10 seconds) we'll greatly reduce the number of requests made.
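
To make the effect of the backoff concrete, here is a small sketch that compares a fixed 500ms interval with a growing one. The exact schedule is not given in this description, so the growth factor of 1.5 and the 5 second cap are assumptions, and `pollDelayMs` and `countPolls` are hypothetical helpers.

```typescript
// Delay before the poll with index `attempt` (0-based).
// Assumed schedule: start at 500ms, multiply by 1.5 each time, cap at 5 seconds.
function pollDelayMs(
  attempt: number,
  baseMs: number = 500,
  factor: number = 1.5,
  maxMs: number = 5_000,
): number {
  return Math.min(maxMs, Math.round(baseMs * Math.pow(factor, attempt)));
}

// Counts how many polls are made before a prediction lasting `durationMs` is seen as finished.
function countPolls(
  durationMs: number,
  delayFor: (attempt: number) => number,
): number {
  let elapsed = 0;
  let polls = 0;
  while (elapsed < durationMs) {
    elapsed += delayFor(polls);
    polls++;
  }
  return polls;
}

const fixedPolls = countPolls(30_000, () => 500); // constant 500ms interval
const backoffPolls = countPolls(30_000, pollDelayMs); // growing interval
```

Under these assumed numbers, a 30 second prediction takes 60 polls at a fixed 500ms interval and 10 with the backoff, while a 2 second prediction goes from 4 polls to 3.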

vercel[bot] commented 9 months ago

The latest updates on your projects.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| zoo | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Dec 8, 2023 1:20pm |

cbh123 commented 9 months ago

Gave it a quick try in the deployment preview and it looks good! Thank you!