drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

Restrict maximum timeout and maxsize a user can demand #639

Open lonvia opened 2 years ago

lonvia commented 2 years ago

The --time and --space parameters for the dispatcher can, to my understanding, only restrict the global resources that Overpass uses. The problem is that they can't do much to help with fair sharing of resources. For my current use case (https://github.com/openstreetmap/operations/issues/569) I have a relatively large server that is supposed to serve many tiny queries, and I'd like to keep the long-running ones out.

It would be nice if it were possible to restrict on the server side the timeout/maxsize that a user can request in a query. If a user requests a larger timeout/maxsize, they should get the usual out-of-resources error response.
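A minimal sketch of the requested check, not actual Overpass API code: the cap values, the function name and the error text are hypothetical, but the idea is that the server compares the requested [timeout:...]/[maxsize:...] against its own limits and refuses the query before it starts.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical server-side caps; in a real deployment these would come from
// the dispatcher configuration or the environment rather than constants.
constexpr uint32_t max_allowed_timeout = 15;                   // seconds
constexpr uint64_t max_allowed_maxsize = 64ull * 1024 * 1024;  // bytes

// Refuse a query up front if it asks for more than the server permits,
// instead of starting it and killing it later.
void check_requested_limits(uint32_t requested_timeout, uint64_t requested_maxsize)
{
  if (requested_timeout > max_allowed_timeout || requested_maxsize > max_allowed_maxsize)
    throw std::runtime_error(
        "runtime error: requested timeout or maxsize exceeds the limit of this server");
}
```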

mmd-osm commented 2 years ago

It is sometimes difficult to predict the exact runtime of a query. If you're providing a rather large timeout value, it might still happen that the query finishes much quicker.

I think a more backwards-compatible approach would be to define optional maximum timeout/maxsize values per user, and if the user happens to set a timeout value of 100'000 s, we could automatically replace that value with the maximum permitted value.

For this purpose, I have defined two new environment variables in my fork and use logic along the lines of https://github.com/mmd-osm/Overpass-API/commit/1a0da05851fdd23959d37372c74d9f57d29a1e8d

- OVERPASS_MAX_TIMEOUT: global override for the maximum permitted [timeout:...] value.
- OVERPASS_MAX_ELEMENT_LIMIT: global override for the maximum permitted [maxsize:...] value.
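For illustration, a minimal sketch of such capping logic, assuming only the two environment variables named above; the helper names, fallback values and overall structure are hypothetical and not taken from the linked commit:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// Read an unsigned integer from the environment, falling back to a default
// when the variable is unset or cannot be parsed.
uint64_t env_or_default(const char* name, uint64_t fallback)
{
  const char* value = std::getenv(name);
  if (!value)
    return fallback;
  char* end = nullptr;
  uint64_t parsed = std::strtoull(value, &end, 10);
  return (end && *end == '\0' && parsed > 0) ? parsed : fallback;
}

// Silently cap the user-supplied [timeout:...] and [maxsize:...] values at the
// globally permitted maximums instead of rejecting the query.
void apply_global_caps(uint32_t& timeout, uint64_t& element_limit)
{
  // The fallback caps here are placeholders, not the fork's actual defaults.
  uint64_t max_timeout = env_or_default("OVERPASS_MAX_TIMEOUT", 86400);
  uint64_t max_elements = env_or_default("OVERPASS_MAX_ELEMENT_LIMIT", 2ull * 1024 * 1024 * 1024);
  timeout = static_cast<uint32_t>(std::min<uint64_t>(timeout, max_timeout));
  element_limit = std::min(element_limit, max_elements);
}
```

With this approach an over-large request such as [timeout:100000] would simply run as if the maximum permitted value had been set.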

lonvia commented 2 years ago

I think it is better to be explicit with the user here. If the server does not permit a larger timeout, then there is no point in requesting one. You might as well change your query to set the largest permissible timeout.

I admit that I have a different motivation for returning an error rather than just using the smaller timeout. In my case I want to set a very small timeout, which won't work for most of the more complex queries. And in this case I don't even want the complex queries to be started (and then inevitably shut down). I'd rather see the user be told: you can't run that query here.

mmd-osm commented 2 years ago

One thing to keep in mind is that many queries (maybe around 80%) don't provide an explicit timeout or maxsize value and fall back to the hard-coded default values (that's 180 s for the timeout and 512 MiB for the element limit). In particular, the maxsize setting is fairly unknown and is used in less than 1% of queries.

Although this is a somewhat different requirement, for which I'm using dedicated parameters (OVERPASS_DEFAULT_TIMEOUT and OVERPASS_DEFAULT_ELEMENT_LIMIT), it should be included in this discussion on user limits.

I don't think it's a good idea to reject queries without explicit limits, so we also need to think about this case.
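To make that case concrete, here is a rough sketch of how operator-defined defaults could be filled in for queries that set no explicit limits; only the two environment variable names and the 180 s / 512 MiB fallbacks come from the comments above, everything else is hypothetical:

```cpp
#include <cstdint>
#include <cstdlib>

// Hard-coded fallbacks mentioned above: 180 s timeout, 512 MiB element limit.
constexpr uint32_t hard_default_timeout = 180;
constexpr uint64_t hard_default_element_limit = 512ull * 1024 * 1024;

// Hypothetical helper: read an operator-defined default from the environment,
// falling back to the hard-coded value when the variable is unset.
uint64_t default_from_env(const char* name, uint64_t fallback)
{
  const char* value = std::getenv(name);
  return value ? std::strtoull(value, nullptr, 10) : fallback;
}

// Hypothetical representation of the parsed query settings.
struct Query_Settings
{
  bool has_explicit_timeout = false;
  bool has_explicit_maxsize = false;
  uint32_t timeout = 0;
  uint64_t element_limit = 0;
};

// Apply defaults only where the query did not set a value itself; any
// maximum-value capping or rejection would then happen in a separate step.
void fill_defaults(Query_Settings& settings)
{
  if (!settings.has_explicit_timeout)
    settings.timeout = static_cast<uint32_t>(
        default_from_env("OVERPASS_DEFAULT_TIMEOUT", hard_default_timeout));
  if (!settings.has_explicit_maxsize)
    settings.element_limit =
        default_from_env("OVERPASS_DEFAULT_ELEMENT_LIMIT", hard_default_element_limit);
}
```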

drolbr commented 2 years ago

The requests that lonvia has in mind can be configured such that there is always a timeout, so that is not the problem here. Rather, such a scheme changes both the architecture and the observable behaviour: there must be a place to set that value, and it is probably not at compile time. This means the value must be communicated to the clients. In the end it will be a dispatcher setting, environment variables, or something similar. Thus I deem this rather appropriate for a major version, i.e. 0.7.58.

For the time being I suggest setting maxtime to a low value in the dispatcher. If it is set to less than 360, then no query with the default timeout can run, and yet up to 22 requests with a timeout of at most 15 seconds can run in parallel.