drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

Overpass seems to return inconsistent data when querying adiffs #671

Closed rene78 closed 1 year ago

rene78 commented 1 year ago

The query below with a rather large bounding box only returns node information.

overpass-api.de/api/interpreter?data=[adiff:"2022-07-10T10:41:26.850Z"][bbox:47.33382034035365,8.420677185058594,47.42622912485741,8.642120361328125][out:xml][timeout:22];way->.ways;(.ways>;node;);out meta;.ways out geom meta;

This query with a smaller bounding box returns node and way information, which is the expected result:

overpass-api.de/api/interpreter?data=[adiff:"2022-07-10T10:44:55.570Z"][bbox:47.38170133628693,8.55087161064148,47.38747638741845,8.564711809158325][out:xml][timeout:22];way->.ways;(.ways>;node;);out meta;.ways out geom meta;

Is the request too extensive so that Overpass only returns nodes?

rene78 commented 1 year ago

Some more details can be found in this post: https://github.com/tyrasd/latest-changes/issues/17#issuecomment-1186485428

mmd-osm commented 1 year ago

I think your timeout value is too small, so only the result of the first "out meta;" statement is returned. You should see the following error message towards the end of the XML response: " runtime error: Query timed out in "print" at line 1 after 25 seconds. ".

That's really all working as expected. If the query times out somewhere in the middle of processing, you only get parts of the result. Note that a single "out" statement will always be processed completely.
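A client can check for this case before trusting the result. The following is a minimal sketch, assuming the error is delivered in a `remark` element of the XML response; the function names and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def overpass_remarks(xml_text: str) -> list[str]:
    """Return the stripped text of every <remark> element in the response."""
    root = ET.fromstring(xml_text)
    return [(el.text or "").strip() for el in root.iter("remark")]


def timed_out(xml_text: str) -> bool:
    """True if any remark reports a query timeout."""
    return any("Query timed out" in remark for remark in overpass_remarks(xml_text))


# Hypothetical, heavily shortened response: one node, then the timeout remark.
sample = """<osm version="0.6" generator="Overpass API">
  <node id="1" lat="47.38" lon="8.55"/>
  <remark> runtime error: Query timed out in "print" at line 1 after 25 seconds. </remark>
</osm>"""

if timed_out(sample):
    print("Partial result only:", overpass_remarks(sample))
```

If `timed_out` returns true, the elements received so far are only part of the result and the request should be repeated with a larger timeout.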

rene78 commented 1 year ago

@mmd-osm, thank you very much! Didn't see the error at the end of the xml. Then I will simply increase the timeout a bit. Any chance to combine the output of meta and .ways? Read in the docs but couldn't find anything.

drolbr commented 1 year ago

I suggest using a different query. For the example: overpass-api.de/api/interpreter?data=[adiff:"2022-07-10T10:41:26.850Z"][bbox:47.33382034035365,8.420677185058594,47.42622912485741,8.642120361328125][out:xml];nw;out geom meta;. This returns the same information and has only one output statement.

The timeout is of limited interest here, as I cannot guarantee any response time, but it should be in sync with whatever timeout you use in the application. The .ways>; part is redundant, because nodes are either changed in their own right and then found by node;, or unchanged and then eliminated by the semantics of adiff.
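One way to keep the two timeouts in sync is to build the query from the same value the application uses for its HTTP request. This is only a sketch: `adiff_query`, `adiff_url` and `APP_TIMEOUT_S` are hypothetical names, and the `[timeout:...]` setting is added back to the single-statement query as an assumption about how a client might use it.

```python
from urllib.parse import quote

API = "https://overpass-api.de/api/interpreter"
APP_TIMEOUT_S = 60  # hypothetical: the timeout the application uses for the HTTP call


def adiff_query(since: str, bbox: tuple[float, float, float, float], timeout_s: int) -> str:
    """Build the single-statement adiff query; bbox is (south, west, north, east)."""
    south, west, north, east = bbox
    settings = (
        f'[adiff:"{since}"]'
        f"[bbox:{south},{west},{north},{east}]"
        f"[out:xml][timeout:{timeout_s}]"
    )
    return f"{settings};nw;out geom meta;"


def adiff_url(query: str) -> str:
    """Percent-encode the query and attach it as the data parameter."""
    return f"{API}?data={quote(query)}"


query = adiff_query(
    "2022-07-10T10:41:26.850Z",
    (47.33382034035365, 8.420677185058594, 47.42622912485741, 8.642120361328125),
    APP_TIMEOUT_S,
)
print(adiff_url(query))
```

The same `APP_TIMEOUT_S` would then be passed to whatever HTTP client performs the request, so the server and the client give up at about the same time.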

rene78 commented 1 year ago

That looks very promising, @drolbr!

However, there seems to be a tiny mismatch between the 2 queries (at least for the example that I tried).
Old query: https://overpass-turbo.eu/s/1ker
New query: https://overpass-turbo.eu/s/1kes

In your proposed query the node with id=264110987 somehow is not downloaded. Doesn't sound like a big deal but unfortunately it would mess up the display of the changesets in "latest-changes", since those points are needed to get certain meta data.

Any idea why this node is not downloaded in your proposed query?

Old query (correct): image

New query (slightly incorrect): newQuery

drolbr commented 1 year ago

Any idea why this node is not downloaded in your proposed query?

Neither the old nor the new coordinates of node 264110987 lie within the given bounding box [bbox:47.38170133628693,8.55087161064148,47.38747638741845,8.564711809158325].

Nonetheless, one gets this node as part of the way 1055956690. In fact, this way is only included because this node has slightly moved and the other, unchanged, part of the way is within the bounding box.
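The distinction can be illustrated with a simplified sketch. The coordinates below are invented (the thread does not give the real ones), `BBox` and `way_touches_bbox` are hypothetical helpers, and the rule "a way matches if any of its nodes is inside" is a simplification; the actual bounding-box test in Overpass may be more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        """True if the point lies inside the box (edges included)."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def way_touches_bbox(node_coords: list[tuple[float, float]], bbox: BBox) -> bool:
    """Simplified rule: a way matches if at least one of its nodes is inside."""
    return any(bbox.contains(lat, lon) for lat, lon in node_coords)


bbox = BBox(47.38170133628693, 8.55087161064148, 47.38747638741845, 8.564711809158325)

# Invented coordinates: the moved node sits just north of the box,
# while another node of the same way is inside it.
moved_node = (47.3880, 8.5600)
unchanged_node = (47.3850, 8.5600)

print(bbox.contains(*moved_node))                          # node alone: not matched
print(way_touches_bbox([moved_node, unchanged_node], bbox))  # way: matched
```

Under these assumptions the node is not found by a plain node statement, but it still appears in the geometry of the way, which is consistent with the behaviour described above.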

rene78 commented 1 year ago

Thank you very much, @drolbr! I'll compare the 2 queries some more. If the advantages prevail I am going to replace the Overpass query with your proposal.