mvexel / overpass-api-python-wrapper

Python bindings for the OpenStreetMap Overpass API
Apache License 2.0
368 stars 90 forks

Avoid returncode 429 (MultipleRequestsError) by passing customizable user-agent and from fields #88

Closed s-m-e closed 2 years ago

s-m-e commented 6 years ago

I just ran into an issue along the lines of the one described here. The Overpass API would suddenly return 429 where it had previously just worked. As it turns out, this may be due to a (temporary) restriction, which can be avoided by sending custom user-agent and from headers.

I solved it by modifying the following bit of code in overpass like this:

        try:
            r = requests.post(
                self.endpoint,
                data=payload,
                timeout=self.timeout,
                headers={
                    'Accept-Charset': 'utf-8;q=0.7,*;q=0.7',
                    # Headers that identify the client; the values are placeholders.
                    'From': 'somebody@website.xyz',
                    'Referer': 'http://www.website.xyz/',
                    'User-Agent': 'overpass-api-python-wrapper (Linux x86_64)',
                },
            )

I'd love it if those fields were customizable through the API class constructor.

mvexel commented 6 years ago

Good idea -- are you in a position to help with a PR? Or do you know someone who can help?

s-m-e commented 6 years ago

@mvexel I can, if the above design is OK (?). I'd add a few properties to the API class, set from optional keyword arguments in the __init__ routine and otherwise initialized with default values. Speaking of default values, any preferences?
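A minimal sketch of that design, under stated assumptions: the keyword arguments `user_agent`, `from_email` and `referer`, the `build_headers` helper, and the default values are all hypothetical and only illustrate the idea. They are not the wrapper's actual API.

```python
class API:
    """Hypothetical, trimmed-down stand-in for the wrapper's API class."""

    # Illustrative default only; not necessarily what the real library sends.
    DEFAULT_USER_AGENT = "overpass-api-python-wrapper"

    def __init__(self, endpoint="https://overpass-api.de/api/interpreter",
                 timeout=25, user_agent=None, from_email=None, referer=None):
        self.endpoint = endpoint
        self.timeout = timeout
        # Optional keyword arguments fall back to defaults when omitted.
        self.user_agent = user_agent or self.DEFAULT_USER_AGENT
        self.from_email = from_email
        self.referer = referer

    def build_headers(self):
        """Return the HTTP headers for a request, skipping unset fields."""
        headers = {
            "Accept-Charset": "utf-8;q=0.7,*;q=0.7",
            "User-Agent": self.user_agent,
        }
        if self.from_email:
            headers["From"] = self.from_email
        if self.referer:
            headers["Referer"] = self.referer
        return headers
```

The request itself would then pass `headers=self.build_headers()` to `requests.post`, in place of the hard-coded dictionary shown earlier.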

mmd-osm commented 6 years ago

HTTP 429 means that you're hitting the rate limiter by sending too many and/or too expensive queries. Check /api/status to find out how many seconds you need to wait before sending the next request.
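As a rough illustration of reading the wait time from /api/status: the sketch below assumes the status page is plain text containing lines such as "2 slots available now." or "Slot available after: ..., in 5 seconds.". That format is an assumption about the server's output, and `seconds_until_slot` is a hypothetical helper, not part of the wrapper.

```python
import re


def seconds_until_slot(status_text):
    """Return the seconds to wait before the next request (0 if none).

    Assumes the status text reports free slots as "N slots available now."
    and busy slots as "Slot available after: <time>, in N seconds.".
    """
    # A line reporting one or more free slots means no waiting is needed.
    if re.search(r"^\s*[1-9]\d* slots? available now", status_text, re.MULTILINE):
        return 0
    # Otherwise wait for the slot that frees up soonest.
    waits = [int(s) for s in re.findall(r"in (\d+) seconds", status_text)]
    return min(waits) if waits else 0
```

The status text itself could be fetched with something like `requests.get(...)` against the server's /api/status URL and passed in as `response.text`.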

Obviously, putting in a suitable User-Agent is always a plus, but it won't affect the overall HTTP 429 logic in any way.

BTW: This issue was reported before (#38), but it was closed again because no one picked it up for quite a while. Great that there's some follow-up now!

s-m-e commented 6 years ago

@mmd-osm Hmm, have a look at this [EDIT: I tried to link to the answer, but that does not work; you must scroll down manually]. My understanding is that requests without a Referer and/or From field are sometimes temporarily blocked for entire IP address ranges. So you can get a 429 without having done anything wrong (?).

mmd-osm commented 6 years ago

Did you encounter HTTP 429 for the very first query already, or only after sending several queries? Did you also try the same query in overpass turbo?

Usually, measures like those described on the Help page are only taken in extreme cases where single users abuse the service by flooding it with requests. In that case, "temporary" could mean several days, maybe weeks (I don't remember whether the Python blocking is still in place right now).

s-m-e commented 6 years ago

@mmd-osm I have a piece of code which does fairly small and inexpensive queries. I use it every now and then (once a month or so) to get the latest data from OSM. It always worked, until one day it did not - right from the first query. Resetting my DSL connection, i.e. obtaining a new IP address, did not change anything (the new address was from the same block as the old one). Adding the above fix, on the other hand, immediately gave me access again.

mvexel commented 6 years ago

Perhaps we need to support the API status request natively. I think it returns the time you need to wait to resume requests from your IP. Users can then space out their requests in their own applications. How does that sound?
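One way an application might space out its requests, sketched under assumptions: `post_with_backoff`, `send` and `wait_seconds` are hypothetical names, `send` stands for whatever performs the query and returns an object with a `status_code`, and `wait_seconds` stands for whatever obtains the wait time (for example from the status request).

```python
import time


def post_with_backoff(send, wait_seconds, max_attempts=3, sleep=time.sleep):
    """Call send() again after a pause while the server answers HTTP 429.

    send: callable returning an object with a .status_code attribute
    wait_seconds: callable returning how many seconds to pause
    max_attempts: total number of times send() may be called
    """
    response = send()
    for _ in range(max_attempts - 1):
        if response.status_code != 429:
            break
        # Rate limited: pause for the reported time, then try again.
        sleep(wait_seconds())
        response = send()
    return response
```

The last response is returned even if it is still a 429, so the caller decides whether to raise an error or give up quietly.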

mmd-osm commented 6 years ago

I've created https://github.com/drolbr/Overpass-API/issues/351 to get rid of the /api/status call (it's best suited for debugging purposes) and to include the relevant information in the HTTP response instead. Unfortunately, this more standards-like approach is not yet available.

mvexel commented 2 years ago

Given the comment from @mmd-osm, I'm closing this issue for now. Feel free to re-open it if you see a good path forward.