wiktorn / Overpass-API

Overpass API docker image
MIT License

Handle (or not) non-minute diffs #108

Open oesteban-vx opened 9 months ago

oesteban-vx commented 9 months ago

Summary: hour diffs (or anything other than minute diffs) won't work because the base clone and the diff repositories don't support them.

I have created a container using OVERPASS_DIFF_URL=https://planet.openstreetmap.org/replication/hour/.

While debugging the issue, I traced it to the file /app/bin/download_clone.sh (inside the Docker image). The entry point calls http://dev.overpass-api.de/api_drolbr/trigger_clone and stores the result in this file:

$ cat /db/db/base-url
https://dev.overpass-api.de/clone//2023-10-28

Let's ignore the duplicated /. This is the initial dataset we'll download. It includes a replicate_id, which is the sequence number of the latest diff the data contains; we fetch diffs from that number onward to update the dataset. The problem is that the replicate_id corresponds to a minute diff, so we won't find the matching diff under OVERPASS_DIFF_URL.

I guess those repositories (cloned dataset, diffs) are created by the source server, so there's nothing we can really do about that. However, I wonder if we could look up the timestamp of the minute diff in https://planet.openstreetmap.org/replication/minute/<replicate_id> and, once we have it, look for the hour replicate id that corresponds to that timestamp. We probably won't get a diff for exactly that timestamp, though.
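To make the idea concrete, here is a minimal Python sketch of the lookup, not a tested patch. It assumes the usual planet.openstreetmap.org replication layout, where the state file for a sequence number lives at a zero-padded, three-part path (for example 005/812/345.state.txt) and contains sequenceNumber and timestamp entries. The function names are hypothetical, and the estimate assumes hour diffs are published exactly one hour apart, so a real implementation would still need to fetch the estimated state file and verify its timestamp.

```python
import math
from datetime import datetime, timezone


def state_url(base_url: str, sequence: int) -> str:
    """Build the state-file URL for a replication sequence number."""
    padded = f"{sequence:09d}"  # e.g. 5812345 -> "005812345"
    return (
        f"{base_url.rstrip('/')}/"
        f"{padded[0:3]}/{padded[3:6]}/{padded[6:9]}.state.txt"
    )


def parse_state(text: str) -> tuple[int, datetime]:
    """Extract (sequenceNumber, timestamp) from the contents of a state.txt file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Colons are escaped in the file, e.g. 2023-10-28T00\:00\:00Z
        fields[key] = value.replace("\\:", ":")
    stamp = datetime.strptime(fields["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    return int(fields["sequenceNumber"]), stamp.replace(tzinfo=timezone.utc)


def estimate_hour_sequence(minute_ts: datetime, hour_seq: int, hour_ts: datetime) -> int:
    """Estimate the newest hour sequence whose timestamp is not after minute_ts.

    hour_seq and hour_ts describe the latest hour state. Assumes one hour
    between consecutive hour diffs (an assumption, not a guarantee).
    """
    hours_behind = math.ceil((hour_ts - minute_ts).total_seconds() / 3600)
    return hour_seq - max(hours_behind, 0)
```

Starting from an hour diff at or before the clone's timestamp means the first hour diff applied overlaps with data the clone already contains, which matches the caveat above about not getting a diff for exactly that timestamp.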

At the very least, once the container has been created, it should print a message explaining that updates won't work (and not even attempt them).
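A rough sketch of such a guard, in Python for illustration only (the image's own scripts are shell). It assumes the granularity is the last path segment of OVERPASS_DIFF_URL, as in the planet.openstreetmap.org URLs above; other replication servers may lay out their URLs differently, and both function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse


def diff_granularity(diff_url: str) -> str:
    """Return the last path segment of the replication URL, e.g. 'minute' or 'hour'."""
    path = urlparse(diff_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def update_warning(diff_url: str) -> Optional[str]:
    """Return a warning if the URL does not look like a minute-diff source, else None."""
    granularity = diff_granularity(diff_url)
    if granularity == "minute":
        return None
    return (
        f"OVERPASS_DIFF_URL points at '{granularity}' diffs, but the cloned "
        "database carries a minute replicate_id; updates will be skipped."
    )
```

The caller would print the returned message once at startup and skip the update loop when it is not None.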

wiktorn commented 9 months ago

I think this is possible, maybe using logic similar to what pyosmium-get-changes uses to match the dump to the replication source.

Feel free to create a PR with that.