Rate Limiting for Overpass-API Requests

maproulette / maproulette3

MapRoulette, the micro-tasking tool for OpenStreetMap

https://maproulette.org

MIT License

120 stars 32 forks source link

Rate Limiting for Overpass-API Requests #605

Open Noki opened 5 years ago

Noki commented 5 years ago

The Overpass-API enforces rate-limits. When rebuilding challenges this often results in a "Too Many Requests" error message:

Too Many Requests:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" lang="en"/>
  <title>OSM3S Response</title>
</head>
<body>

<p>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</p>
<p><strong style="color:#FF0000">Error</strong>: runtime error: open64: 0 Success /osm3s_v0.7.55_osm_base Dispatcher_Client::request_read_and_idx::rate_limited. Please check /api/status for the quota of your IP address. </p>

</body>
</html>

Maproulette should have a queue for requests to the overpass api and work around the rate limits.

drolbr commented 5 years ago

I'm open to figure out how to get Maproulette's use cases running more smoothly.

I need to understand what access pattern you use: Is it many request in a very short time (100 in 10 seconds or so) or rather many long running, consecutive requests (10 each of which is running 1 minute)? The more precise I understand your pattern the better I can fit changes to the quota algorithm.

Unfortunately, it is not possible to just drop all rate limits. While most users are diligent, the few that aren't would otherwise consume the vast majority of resources.

The current algorithm works as follows:

requests from the same IPv4 address or the same upper 64 bit of an IPv6 address are considered to come from the same user
each user is given a share of the server time that is inverse proportionate to the number of currently active users. E.g. if there are 10 users concurrently querying while the server (8 cores) is configured to handle up to 16 users concurrently then you can obtain most 6/16 of one server thread (per slot, see below). In particular, the quota varies automatically to account for changing loads
there are at the moment 2 slots per IP address. I.e. you can immediately start two requests or run one request every 12/16 of the time in the example above.

I suggest to document the usage pattern here. Then I suggest to move over to the Overpass API repo to ensure that the relevant parts are easy to find on both sides.

Noki commented 5 years ago

Hi Roland,

I think what the overpass API is doing in general is fine, the rate limiting has to be done within marroulette. The only thing I could think of (on the side of the overpass API) is that it could give better machine readable status information that would allow to handle errors smarter and give the status information faster.

I noticed that when you hit the rate limit it takes a lot of time to get the 429 Status code. I don't know if you delay the response on purpose or if you do some work in the background by accident. I would expect that an API would send a 429 response without any delay and without doing any work in the background. In addition you could also send a Retry-After HTTP-Header to indicate how long the client should wait before doing the next request.

In addition I think you could make use of other status codes to indicate problems with the api: https://www.ietf.org/assignments/http-status-codes/http-status-codes.xml

A 408 or 504 instead of a timeout with a 200 would be a better match. A 503 could be used during maintenance also with a Retry-After HTTP-Header.

And so on... I think you get the idea.

Best regards Tobias

drolbr commented 5 years ago

Thank you for the feedback. There are two different issues involved, on top of the already mentioned rate limits.

One thing is the timeout inside a response that starts with HTTP 200. This is inevitable once the server has started to send data. The rationale to send data is to speed up requests like large /map calls - the user can already begin to receive the data before the server has completed the request. However, I am happy to look into a specific example if this behaviour is undesirable for a certain type of requests.

The other thing is the delay of the HTTP 429. I observe the following to access patterns on the server:

People using requests from a slippy map with some panning around. From the server perspective this is a burst of requests and then comparably long silence. This is a reasonable use case that does allow for many users.
Greedy scripts, in particular starting the next request once the previous has completed or failed, sending requests that are obviously pointless from a human perspective (like sequentially requesting all ways by their id, or hundreds of times the same bounding box). If the server returns immediately in such a case then the script sends immediately a follow-up request. The server addresses this by keeping a request for up to 15 seconds on hold if it is busy or the quota is exceeded. The slippy map use cases usually get successfully serialized by the server and keep within the 15 seconds, and the stupid scripts do not consume too much disk and CPU load of the server.

Like with the other rate limits I am open to search for an improved algorithm. Ideas are e.g. to allow for a fail-fast flag on the request or to dismiss requests that have no chance to start within 15 seconds, but other ideas are welcome.

mvexel commented 5 years ago

Some ways to mitigate / solve this issue I see are:

Responding gracefully to non 200 responses from Overpass API ('Please try again later', or something else appropriate if the request timed out or ran into memory constraints on the Overpass side)
Disallow rebuilding existing challenges within a day (or whatever other time delta makes sense)
Encouraging users to test their queries in Overpass Turbo (for example through a link 'Test this query in Overpass Turbo') combined with better documentation of the expected response.
An actual Overpass request queue. This is perhaps less trivial than it sounds, because the MR interface is currently not built around deferred requests.

These are by no means the only or the complete solution, but I would like to arrive at something that is appropriate to the size of the problem.

Noki commented 5 years ago

@drolbr I would usually do the rate limiting with a load balancer in front of the backend services. Even a high number of requests are usually not a problem for load balancers and if you want people to write better code it is better to have them running into your rate limit right away instead of doing the limiting for them by delaying the requests. By using a rate limiting load balancer you could even rate-limit and balance accross multiple backend servers.

I know you are currently using Apache, but you should have a look at this article covering rate limiting in nginx. In your case I would probably just set this up, use 429 status codes instead of the 503 (default) and balance accross all overpass-api backend servers.

Regarding starting a 200 response even when it might result in a timeout: I would only do this for data-formats that allow reading the data in chunks (e.g. a stream of messages encoded in json) and than I would encode the timeout message in the same way.

mvexel commented 5 years ago

@Noki the Overpass-specific aspects of this are probably better handled in a ticket on the Overpass repo. Perhaps one of these already addresses parts of what you are suggesting?

mmd-osm commented 5 years ago

I would expect that an API would send a 429 response without any delay and without doing any work in the background. In addition you could also send a Retry-After HTTP-Header to indicate how long the client should wait before doing the next request.

That's exactly what https://github.com/drolbr/Overpass-API/issues/351 is about. Please follow up this part of the discussion in that issue.

I would usually do the rate limiting with a load balancer in front of the backend services. Even a high number of requests are usually not a problem for load balancers and if you want people to write better code it is better to have them running into your rate limit right away instead of doing the limiting for them by delaying the requests.

That doesn't really match today's architecture, where load is distributed using DNS round robin, i.e. there's no dedicated load balancer at all on Overpass side.

Regarding starting a 200 response even when it might result in a timeout: I would only do this for data-formats that allow reading the data in chunks (e.g. a stream of messages encoded in json) and than I would encode the timeout message in the same way.

Yeah, we've discussed this at length in the past already. It would mean that you would have to buffer the result locally before sending to the client to be sure there's no timeout. That's just too expensive, and it would delay delivery of the data to the client quite a bit. I don't think changing this behavior is in any way feasible atm.

mvexel commented 1 month ago

We should look into all current open Overpass related issues together sometime. @ljdelight @jschwarz2030