drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

.osc files might have items in the wrong order (when converted from .osh with osmium-tool or osmconvert) #652

Open plepe opened 2 years ago

plepe commented 2 years ago

What did you do exactly?

I have a .osh file which I convert to a .osc file (using osmium or osmconvert, which produce the same result) to initialize an Overpass API database with attic data. Apparently, Overpass API expects the items to appear in chronological order, with ways after nodes. If I import the .osc file as is, Overpass API complains about missing nodes. If I re-order the .osc file, the import succeeds.

Actually, I do not know whether the bug is in Overpass API or in osmium, so I created a bug report there too: osmcode/osmium-tool#241

These are the commands I use:

osmium cat bug.osh -o bug.osc
cat bug.osc | update_database --db-dir=test --keep-attic

Query an object:

echo '[out:json][date:"2011-01-01T00:00:00Z"];way(86127691);out meta geom;' | osm3s_query --db-dir=test

Result:

{
  "version": 0.6,
  "generator": "Overpass API 0.7.57.1 74a55df1",
  "osm3s": {
    "timestamp_osm_base": "",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [
{
  "type": "way",
  "id": 86127691,
  "timestamp": "2010-11-22T17:27:01Z",
  "version": 1,
  "changeset": 6432752,
  "user": "digitalhippie",
  "uid": 41463,
  "bounds": {
    "minlat": 100.0000000,
    "minlon": 200.0000000,
    "maxlat": 100.0000000,
    "maxlon": 200.0000000
  },
  "nodes": [
    999601342,
    999600632,
    999601129,
    999600521,
    999601342
  ],
  "geometry": [
  ],
  "tags": {
    "building": "yes",
    "source": "Yahoo"
  }
}
  ]
}

I modified bug.osc by re-ordering and merging the create/modify/delete statements; see bug1.osc.

When I import this file instead, I get the following result:

{
  "version": 0.6,
  "generator": "Overpass API 0.7.57.1 74a55df1",
  "osm3s": {
    "timestamp_osm_base": "",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [
{
  "type": "way",
  "id": 86127691,
  "timestamp": "2010-11-22T17:27:01Z",
  "version": 1,
  "changeset": 6432752,
  "user": "digitalhippie",
  "uid": 41463,
  "bounds": {
    "minlat": 48.2028205,
    "minlon": 16.3402169,
    "maxlat": 48.2042181,
    "maxlon": 16.3416447
  },
  "nodes": [
    999601342,
    999600632,
    999601129,
    999600521,
    999601342
  ],
  "geometry": [
    { "lat": 48.2042150, "lon": 16.3402169 },
    { "lat": 48.2028205, "lon": 16.3404620 },
    { "lat": 48.2028268, "lon": 16.3416447 },
    { "lat": 48.2042181, "lon": 16.3415269 },
    { "lat": 48.2042150, "lon": 16.3402169 }
  ],
  "tags": {
    "source": "Yahoo",
    "building": "yes"
  }
}
  ]
}
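
The two results above differ in one telling way: in the first, the way has a non-empty "nodes" list but an empty "geometry" array (and bounds outside the valid latitude/longitude range), while in the second every node reference has a coordinate. The following is a minimal Python sketch for spotting such ways in `out meta geom` JSON output. It assumes the output shape shown above; the function name `ways_missing_geometry` and the trimmed sample are made up for illustration.

```python
import json


def ways_missing_geometry(result: dict) -> list:
    """Return the ids of ways that list node refs but carry no geometry."""
    missing = []
    for element in result.get("elements", []):
        if element.get("type") != "way":
            continue
        # Node references without any returned coordinates: the symptom
        # seen in the first result above.
        if element.get("nodes") and not element.get("geometry"):
            missing.append(element["id"])
    return missing


# Trimmed-down version of the first result above (illustration only).
SAMPLE_RESULT = json.loads("""
{
  "elements": [
    {
      "type": "way",
      "id": 86127691,
      "nodes": [999601342, 999600632, 999601129, 999600521, 999601342],
      "geometry": []
    }
  ]
}
""")

if __name__ == "__main__":
    print(ways_missing_geometry(SAMPLE_RESULT))
```

Run against the full output of osm3s_query, an empty list would suggest that the nodes of every returned way had coordinates at the requested date.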

All files mentioned above are in this archive: bug.zip

mmd-osm commented 2 years ago

The data you load into Overpass API needs to strictly follow the structure of planet or minutely diff files: nodes first, followed by ways, and finally relations. In that respect, bug.osc looks OK structurally. However, there are still some issues with timestamps and with deleted nodes that are referenced by ways in the same file.
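
One of the issues named above, ways referencing nodes that are deleted in the same file, can be checked mechanically. Here is a rough Python sketch using only the standard library. It assumes a plain osmChange XML document with create/modify/delete blocks; the function name and the sample data (including all ids) are invented for illustration, and it does not look at timestamps at all.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def ways_referencing_deleted_nodes(osc_xml: str) -> Dict[int, List[int]]:
    """Map way id -> node ids it references that the same file deletes."""
    root = ET.fromstring(osc_xml)

    # Collect every node id that appears inside a <delete> block.
    deleted = {
        int(node.get("id"))
        for block in root.findall("delete")
        for node in block.findall("node")
    }

    hits: Dict[int, List[int]] = {}
    for block in root:
        if block.tag not in ("create", "modify"):
            continue
        for way in block.findall("way"):
            refs = [int(nd.get("ref")) for nd in way.findall("nd")]
            bad = [ref for ref in refs if ref in deleted]
            if bad:
                hits.setdefault(int(way.get("id")), []).extend(bad)
    return hits


# Made-up sample: way 10 uses node 2, which is deleted further down.
SAMPLE_OSC = """<osmChange version="0.6">
  <create>
    <node id="1" version="1" lat="48.0" lon="16.0"/>
    <node id="2" version="1" lat="48.1" lon="16.1"/>
    <way id="10" version="1"><nd ref="1"/><nd ref="2"/><nd ref="1"/></way>
  </create>
  <delete>
    <node id="2" version="2"/>
  </delete>
</osmChange>"""

if __name__ == "__main__":
    print(ways_referencing_deleted_nodes(SAMPLE_OSC))
```

A hit does not by itself prove the file is unusable; it only points at places where the order and timing of the changes would need a closer look.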

I have to say that the official osmium tool never worked for me to load an Overpass database with attic data. This isn't really a bug with osmium tool, it simply doesn't support this special use case.

I created an unpublished patched version, which would split the bug.osh file into multiple files based on time slices, and load one slice at a time. Unfortunately, I don't seem to have the code available anymore...
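
Since that patch is not available, the following Python sketch only illustrates the general idea of time slicing; it is not a reconstruction of it. It assumes object versions have already been read into simple records, slices them by calendar day (the granularity is an arbitrary choice), and orders each slice as nodes, then ways, then relations. The `ObjVersion` type, the function name, and the node timestamps in the sample are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class ObjVersion(NamedTuple):
    kind: str       # "node", "way" or "relation"
    obj_id: int
    version: int
    timestamp: str  # ISO 8601 in UTC, e.g. "2010-11-22T17:27:01Z"


# Required element order within a file: nodes, ways, relations.
KIND_ORDER = {"node": 0, "way": 1, "relation": 2}


def slice_by_day(versions: Iterable[ObjVersion]) -> Dict[str, List[ObjVersion]]:
    """Group versions into per-day slices, returned in chronological order."""
    slices = defaultdict(list)
    for v in versions:
        slices[v.timestamp[:10]].append(v)  # "YYYY-MM-DD" prefix as slice key

    ordered: Dict[str, List[ObjVersion]] = {}
    for day in sorted(slices):
        ordered[day] = sorted(
            slices[day],
            key=lambda v: (KIND_ORDER[v.kind], v.obj_id, v.version),
        )
    return ordered


# Sample input in "wrong" order; the node timestamps are made up.
SAMPLE_VERSIONS = [
    ObjVersion("way", 86127691, 1, "2010-11-22T17:27:01Z"),
    ObjVersion("node", 999601342, 2, "2010-11-23T08:00:00Z"),
    ObjVersion("node", 999601342, 1, "2010-11-22T17:26:00Z"),
    ObjVersion("node", 999600632, 1, "2010-11-22T17:26:00Z"),
]

if __name__ == "__main__":
    for day, items in slice_by_day(SAMPLE_VERSIONS).items():
        print(day, [(v.kind, v.obj_id, v.version) for v in items])
```

Each slice would then be written out as its own change file and loaded before the next one.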

At the very least, you should start with a "planet-like" OSM XML file, where every object is present in a single version only (!). Then continue by applying .osc files which mimic minutely diffs.

Also, attic only really works on a planet scale database. I never managed to get an attic db up based on full history extracts only.

plepe commented 2 years ago

In the last two or three hours, I hacked together a little script which converts a .osh file to a valid .osc file for insertion into Overpass API: https://github.com/plepe/osh2osc

I haven't tested larger datasets yet, but my little tests were successful. I hope somebody finds this useful.

mmd-osm commented 2 years ago

One more thing I forgot: full history planet files typically don't include redacted object versions. If you're unlucky and your extract depends on such redacted objects, Overpass will usually crash at some point during updates.

Also, using a geofabrik extract and applying daily updates in attic mode never worked here...

In general, you're entering uncharted territory and if some things aren't working, that's pretty much how it is.

plepe commented 2 years ago

I'm happy with a final database which does not need further updates from daily diffs (and which I re-import from Geofabrik extracts from time to time). That's fine.

By the way, I noticed that my script didn't really work. Apparently, it's necessary to split the change file into separate files and import them one by one. So, I updated the code to write files into a directory instead.
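
Importing such a directory one file at a time could look roughly like the Python sketch below. It reuses the update_database invocation from the first comment (`--db-dir=... --keep-attic`, change file on stdin). Everything else is an assumption: the function names are made up, and the files are assumed to end in .osc and to be named so that lexicographic order equals chronological order, which may not match what osh2osc actually writes.

```python
import subprocess
from pathlib import Path
from typing import List


def change_files_in_order(directory: str) -> List[Path]:
    """Return the .osc files of a directory, sorted by file name."""
    return sorted(Path(directory).glob("*.osc"))


def update_command(db_dir: str) -> List[str]:
    """Build the update_database call used earlier in this thread."""
    return ["update_database", f"--db-dir={db_dir}", "--keep-attic"]


def import_all(directory: str, db_dir: str) -> None:
    """Feed each change file to update_database on stdin, one by one."""
    for path in change_files_in_order(directory):
        with path.open("rb") as fh:
            # check=True stops at the first failing file, so later slices
            # are not applied on top of a broken import.
            subprocess.run(update_command(db_dir), stdin=fh, check=True)


if __name__ == "__main__":
    # Dry run: only list what would be imported from a hypothetical "out" dir.
    for path in change_files_in_order("out"):
        print(path.name, "->", " ".join(update_command("test")))
```

Stopping at the first error is a deliberate choice here, since each slice builds on the state left by the previous one.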