OSMCha / osm-adiff-service

Generate full changeset representations from overpass augmented diffs + osm metadata

Check `timestamp_osm_base` instead of Augmented Diff status #2

Open nrenner opened 6 months ago

nrenner commented 6 months ago

Currently, the latest Augmented Diff status is used to check if the changes in the passed replication file are already in Overpass.

Overpass includes a `timestamp_osm_base` meta property in all query responses, which is the timestamp of the latest imported edit. My suggestion would be to use that instead (from an empty query). It would be more straightforward and would not depend on the Augmented Diff status file or API, and Augmented Diffs are no longer used here anyway.

Current

The maximum status id is calculated from all change timestamps in the replication file and then compared to the latest Augmented Diff status:

https://github.com/OSMCha/osm-adiff-service/blob/f7925fdc97e1e81ecbe639990fd99a34f98eaea7/lib/get-changesets.js#L22-L33

As I understand from the code, Overpass itself waits for the latest DB timestamp before querying and writing the Augmented Diff and the newest state file. Without a status file, the `augmented_diff_status` API call would calculate the status from the DB timestamp.

Suggested

So why not shortcut this and compare the latest edit timestamp from the replication file with the Overpass DB `timestamp_osm_base`, which represents the latest imported edit timestamp?

Using an empty query like this:

https://overpass.osmcha.org/api/interpreter?data=[out:json];out;

e.g. returns:

{
  "version": 0.6,
  "generator": "Overpass API 0.7.57.1 74a55df1",
  "osm3s": {
    "timestamp_osm_base": "2024-03-21T12:11:56Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [
  ]
}
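
A minimal sketch of the suggested check, assuming Node 18+ (for the global `fetch`) and that the change timestamps from the replication file are available as ISO 8601 strings. The function names `parseOsmBase`, `isImported` and `fetchOsmBase` are hypothetical and not part of the existing code; treating an edit whose timestamp equals `timestamp_osm_base` as already imported is also an assumption.

```javascript
// Empty query quoted above; only the osm3s metadata is of interest.
const OVERPASS_EMPTY_QUERY =
  'https://overpass.osmcha.org/api/interpreter?data=[out:json];out;';

// Extract timestamp_osm_base from a parsed Overpass JSON response.
function parseOsmBase(response) {
  const value = response && response.osm3s && response.osm3s.timestamp_osm_base;
  const date = new Date(value);
  if (!value || Number.isNaN(date.getTime())) {
    throw new Error('Overpass response has no valid timestamp_osm_base');
  }
  return date;
}

// True when the latest edit in the replication file is not newer than
// the Overpass DB timestamp (assumption: "equal" counts as imported).
// An empty list yields true, since there is nothing to wait for.
function isImported(changeTimestamps, osmBase) {
  const latestEdit = Math.max(...changeTimestamps.map((t) => Date.parse(t)));
  return latestEdit <= osmBase.getTime();
}

// Fetch the current timestamp_osm_base with the empty query.
async function fetchOsmBase(url = OVERPASS_EMPTY_QUERY) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Overpass request failed: ${res.status}`);
  return parseOsmBase(await res.json());
}
```

With this, the caller would do something like `isImported(timestamps, await fetchOsmBase())` and retry later when it returns `false`, without touching the Augmented Diff status at all.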

nrenner commented 6 months ago

Some documentation about the timestamp_osm_base (slightly different name for XML here):

> The meta tags contains in the attribute osm_base the entry date with a timestamp of the data. That means that all edits that have been uploaded before this date are included.

Output Formats
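
For completeness, a rough sketch of reading the same value from an XML response, where it appears as the `osm_base` attribute of the `<meta>` element. The function name `parseOsmBaseFromXml` is hypothetical, and the regular expression is a shortcut for this one attribute rather than a real XML parser.

```javascript
// Extract osm_base from the <meta> element of an Overpass XML response.
// Returns a Date, or null when no osm_base attribute is found.
function parseOsmBaseFromXml(xml) {
  const match = /<meta\b[^>]*\bosm_base="([^"]+)"/.exec(xml);
  if (!match) return null;
  return new Date(match[1]);
}
```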