customerio / go-customerio

Official Golang client for the Customer.io API
https://customer.io/docs/api/
MIT License

Implement configurable rate limiter #36

Open joepurdy opened 2 years ago

joepurdy commented 2 years ago

At Customer.io we have an informal rate limit for our public Track API set at 100 RPS. In practice we don't hard-limit folks who exceed it with HTTP 429 responses, because we want customers to be empowered to send us their real-time data no matter how high throughput "real-time" is for them.

However, the informal rate limit exists because a common anti-pattern is the "fire hose" integration, where folks send us API requests at exceptionally high volume, much of it irrelevant to the data they actually need in Customer.io. In these cases a customer can inadvertently cause performance degradation on their own account.

To help folks using the go-customerio library take better control over their integration, we should add a configurable rate limiter that is disabled by default (preserving the current send-as-fast-as-possible behavior) but lets a user throttle their requests if they'd like to reduce the risk of overwhelming their account.

jcobhams commented 2 years ago

Hey @joepurdy, good work on the updates and changes to the library. Since we're on the topic of rate limiting, I think (IMO) a bulk endpoint would be nice to have as well: a middle ground between rate limiting/throttling requests and uploading a CSV of data.

Even if the endpoint takes 1K items per call, that can cut down 10K individual requests to 10 requests. The response structure can be a map of some sort that reports on the status for each item in the batch. WDYT?
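The arithmetic above can be sketched client-side. Customer.io has not published a batch contract (the /v2/batch endpoint mentioned below is only an internal proposal), so the `batchItem` shape and `chunk` helper here are purely illustrative assumptions about how a client library might group 10K items into 1K-item calls.

```go
package main

import "fmt"

// batchItem is a hypothetical envelope for one tracked action;
// the field names are illustrative, not a published API contract.
type batchItem struct {
	Type       string         `json:"type"` // e.g. "event" or "identify"
	Identifier string         `json:"identifier"`
	Attributes map[string]any `json:"attributes,omitempty"`
}

// chunk splits items into groups of at most size, so 10,000
// individual requests collapse into 10,000/1,000 = 10 batch calls.
func chunk(items []batchItem, size int) [][]batchItem {
	var out [][]batchItem
	for size > 0 && len(items) > 0 {
		n := size
		if len(items) < n {
			n = len(items)
		}
		out = append(out, items[:n])
		items = items[n:]
	}
	return out
}

func main() {
	items := make([]batchItem, 10000)
	batches := chunk(items, 1000)
	fmt.Println(len(batches)) // 10
}
```

The per-item status map suggested above would then be the response body for each of those 10 calls, keyed by item index or identifier.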

joepurdy commented 2 years ago

@jcobhams not a horrible idea and I'll add some context as to why we don't yet provide batch endpoints and what adding them might look like. For starters it's definitely a feature we've explored internally for our Track API since we're well aware of the utility folks would get out of a batch endpoint. We do have an internal proposal that explores v2 endpoints for track.customer.io including a /v2/batch endpoint.

One reason we've intentionally avoided adding a batch endpoint is because while our data collection API at track.customer.io could easily handle increased batched requests we see common scale issues for larger workspaces farther on in the data processing pipeline. We've been opinionated about ideal data ingestion rates to guide our customers away from unintentionally DDoSing their Customer.io workspaces.

Our Platform Squad is engaged in work to remove some of our common bottlenecks so I'll raise the idea of the batch endpoints again with the team to see where we're at with that. No promises on a timeline or where it falls in the broader roadmap, but you can be sure it's on our radar and as soon as we make the infrastructural changes to enable a batch endpoint we'll roll support into all our client libraries so folks can better integrate their bulk data updates!