drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

Maximum output size is around 1,7 GB? #601

Closed by ghost 3 years ago

ghost commented 3 years ago

I host my own instance of the Overpass API. When running a query to get JSON output for a larger area, the JSON creation usually stops at 1,7 GB. A timeout is probably not the issue, because the data transfer takes only 16 seconds at 120 MB/s. This applies to both JSON and XML output.

mmd-osm commented 3 years ago

What do you mean by "it stops"? Do you see an error message?

ghost commented 3 years ago

Let me describe my workflow.

  1. Installing overpass-api with Apache2 -> Running dispatcher
  2. Populating the DB with germany-latest.bz2
  3. Sending a query to the HTTP API
     3.1. Script (sorry for bad layout) ` `
     3.2. API call: `wget --post-file=queryfile http://localhost/api/interpreter --output-document=queryresult.json`

Result: wget downloads a JSON file at 120 MB/s for around 16 to 20 seconds. Then the download stops with the message "succeed". The JSON file is about 1,7 GB.
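A success message from wget does not by itself show whether the JSON document is complete. As a minimal sketch, the following Python helper looks only at the last few KiB of the downloaded file. The function name `inspect_tail` is hypothetical, and the check for a `"remark"` member rests on the assumption that the server reports runtime errors that way near the end of the output. A closing brace is a necessary hint, not proof of completeness.

```python
import os


def inspect_tail(path, tail_bytes=4096):
    """Read the last few KiB of a file and return simple completeness hints."""
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        # Jump close to the end so a multi-GB file is not read in full.
        fh.seek(max(0, size - tail_bytes))
        tail = fh.read()
    stripped = tail.rstrip()
    return {
        "size_bytes": size,
        # A complete top-level JSON object must end with a closing brace.
        "ends_with_brace": stripped.endswith(b"}"),
        # Assumption: an error would show up as a "remark" member near the end.
        "has_remark": b'"remark"' in tail,
    }
```

It could be called as `inspect_tail("queryresult.json")`. If `ends_with_brace` is false, the file was most likely cut off mid-stream.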

At first I thought the output was simply supposed to be 1,7 GB, but after populating the DB with the planet dump, the output was the same size.

mmd-osm commented 3 years ago

Can you try your query using standalone osm3s_query tool on the command line, and see if it works? If that's ok, you need to troubleshoot your Apache config.

ghost commented 3 years ago

Is it possible to store the output from osm3s_query?

But I'll check.

mmd-osm commented 3 years ago

Yes, sure, you can run osm3s_query < myquery > myquery.result
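The same redirection can be scripted. This is a minimal Python sketch that feeds a query file to a command on stdin and writes its stdout to a result file. The helper name `run_with_redirect` is hypothetical, and the command list is whatever invocation of `osm3s_query` works on the local installation.

```python
import subprocess


def run_with_redirect(cmd, query_path, result_path):
    """Run cmd with stdin read from query_path and stdout written to result_path.

    Returns the exit code of the command.
    """
    with open(query_path, "rb") as query, open(result_path, "wb") as result:
        # Equivalent to: cmd < query_path > result_path
        proc = subprocess.run(cmd, stdin=query, stdout=result)
    return proc.returncode
```

For example, `run_with_redirect(["osm3s_query"], "myquery", "myquery.result")` mirrors the shell command above. A non-zero return value is a hint that the tool itself failed, independent of Apache.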

ghost commented 3 years ago

This is my workflow now.

  1. Populating the DB with germany osm data

  2. Using this query as "routegermanynobbox" <osm-script output="json" output-config="" timeout="999"><union into="_"><query into="_" type="node"><has-kv k="route" modv="" v="bus"/></query><query into="_" type="way"><has-kv k="route" modv="" v="bus"/></query><query into="_" type="relation"><has-kv k="route" modv="" v="bus"/></query> </union><print e="" from="_" geometry="skeleton" ids="yes" limit="" mode="body" n="" order="id" s="" w=""/><recurse from="_" into="_" type="down"/><print e="" from="_" geometry="skeleton" ids="yes" limit="" mode="body" n="" order="quadtile" s="" w=""/></osm-script>

  3. Start the tool: /bin/osm/bin/osm3s_query < routegermanynobbox > routegermanynobbox.json

Output:

encoding remark: Please enter your query and terminate it with CTRL+D.
encoding remark: Your input contains an 'osm-script' tag. Thus, a line with the
datatype declaration is added. This shifts line numbering by -1 line(s).
After 0h0m36s: in "print", part 0, on line 16. Stack: 0 of 0

du routegermanynobbox.json
1349352 routegermanynobbox.json

This file is even smaller than before. I'll test with the planet dump.
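To compare the `du` figure with the 1,7 GB seen over HTTP, the block count has to be converted to bytes. The sketch below assumes GNU `du` with its default 1024-byte blocks and no block-size environment overrides. Note that `du` reports disk usage, which can differ slightly from the file's apparent size. The function name `du_blocks_to_sizes` is hypothetical.

```python
def du_blocks_to_sizes(blocks, block_size=1024):
    """Convert a du block count to (bytes, decimal GB, binary GiB)."""
    size_bytes = blocks * block_size
    return size_bytes, size_bytes / 1e9, size_bytes / 2**30


size_bytes, gb, gib = du_blocks_to_sizes(1349352)
print(f"{size_bytes} bytes = {gb:.2f} GB = {gib:.2f} GiB")
# prints "1381736448 bytes = 1.38 GB = 1.29 GiB"
```

Under that assumption the standalone result is roughly 1,38 GB, noticeably below the 1,7 GB from the wget download.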

mmd-osm commented 3 years ago

So you want to extract all 198247+ bus routes globally? I think it doesn't make sense to use Overpass API for this purpose, better look at osmium or similar tools for planet scale extraction of data.

ghost commented 3 years ago

True. Overpass is able to resolve the geometries of relations, which makes it super easy. It was worth trying.