onthegomap / planetiler

Flexible tool to build planet-scale vector tilesets from OpenStreetMap data fast
Apache License 2.0

[FEATURE] Real-time OpenStreetMap updates? #47

Open msbarry opened 2 years ago

msbarry commented 2 years ago

Is your feature request related to a problem? Please describe.
Planetiler was initially designed to work in batch mode, but it might be possible to get incremental minutely updates from OpenStreetMap.

Describe the solution you'd like
Run planetiler once to get a base tileset plus some extra indices, then run a continuous (or cron) process that crawls the latest OSM replication diffs (https://wiki.openstreetmap.org/wiki/Planet.osm/diffs) and applies them to the tileset. The initial run could be slower than batch mode but should still be reasonable (<12 hours), and incremental updates should take <1 minute. Ideally, incremental updates should need much less RAM than a full import.
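The replication diffs linked above follow a predictable layout: each minutely update has a sequence number, and the diff file lives at a path derived from that number (zero-padded to 9 digits and split into three directory levels). A minimal sketch of that mapping, assuming a hypothetical helper class (`ReplicationPath` is not planetiler code):

```java
// Hypothetical helper: maps an OSM replication sequence number to the path of
// its minutely diff under e.g. https://planet.openstreetmap.org/replication/minute/
public final class ReplicationPath {
  public static String pathFor(long sequence) {
    // Sequence numbers are zero-padded to 9 digits and split into 3 directory
    // levels, e.g. 5288069 -> 005/288/069.osc.gz
    String padded = String.format("%09d", sequence);
    return padded.substring(0, 3) + "/" + padded.substring(3, 6) + "/"
        + padded.substring(6) + ".osc.gz";
  }

  public static void main(String[] args) {
    System.out.println(pathFor(5288069)); // prints 005/288/069.osc.gz
  }
}
```

A polling loop would fetch `state.txt` for the latest sequence number, then download and apply every diff between the last-applied sequence and the current one.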

Describe alternatives you've considered
Start from a planet.osm.pbf dump, then loop continuously:

This would result in a 1-3 hour latency, depending on how big of a machine you use.

Open Questions

bdon commented 2 years ago

> Is there an API that can provide info for all of the nodes/ways/relations that need to be re-rendered when a related element changes (i.e. a node moves, so we need to re-render the ways that contain it)?

I've attempted this and it is very noisy for any small node change that influences ways > relations > super-relations. I think a general-purpose mirror of raw OSM data is a good approach, and one already exists for Java in libraries like osm-lib.

msbarry commented 2 years ago

The basic approach here would be:

  1. keep a data store of the vector tile features rendered from each OSM element (currently we throw this away after generating the initial tileset)
  2. on each change, find all affected ways, relations, etc. and wipe features rendered from them, keeping track of the tiles they were in
  3. then insert features rendered from the new versions of the OSM element and elements that depend on it, keeping track of tiles they appear in
  4. then re-render every tile that was touched in step 2 or 3

Steps 2 and 3 would basically need random access to every OSM element before and after the change...
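Steps 1-3 above could be sketched with in-memory maps (all names here are hypothetical, and a real implementation would keep these stores on disk rather than in memory):

```java
import java.util.*;

// Minimal in-memory sketch of the incremental-update bookkeeping described
// above. Element, feature, and tile IDs are all plain longs for simplicity.
public final class IncrementalUpdate {
  // Step 1: features rendered from each OSM element, kept after the initial run.
  private final Map<Long, List<Long>> featuresByElement = new HashMap<>();
  // Which tiles each rendered feature landed in.
  private final Map<Long, Set<Long>> tilesByFeature = new HashMap<>();

  /**
   * Applies a change to one element and returns the set of tile IDs that must
   * be re-rendered (step 4). newFeatures maps each freshly rendered feature ID
   * to the tiles it appears in.
   */
  public Set<Long> applyChange(long elementId, Map<Long, Set<Long>> newFeatures) {
    Set<Long> dirty = new HashSet<>();
    // Step 2: wipe features rendered from the old version, remembering their tiles.
    for (long feature : featuresByElement.getOrDefault(elementId, List.of())) {
      Set<Long> oldTiles = tilesByFeature.remove(feature);
      if (oldTiles != null) {
        dirty.addAll(oldTiles);
      }
    }
    // Step 3: insert features rendered from the new version, tracking their tiles.
    featuresByElement.put(elementId, new ArrayList<>(newFeatures.keySet()));
    newFeatures.forEach((feature, tiles) -> {
      tilesByFeature.put(feature, tiles);
      dirty.addAll(tiles);
    });
    return dirty;
  }
}
```

The returned set is the input to step 4: every tile an old or new feature touched gets re-rendered.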

> I've attempted this and it is very noisy for any small node change that influences ways > relations > super-relations.

I thought that might be the case, but my hope was that tile rendering would be fast enough to compensate. I'm not sure that theory would hold, though.

Thanks for the pointer on osm-lib. I also spoke with the baremaps maintainer - they load the nodes, ways, and relations into a postgres database and can render tiles on the fly from that. That approach may make more sense for applications that need real-time updates. We might try to come up with a common format to describe tileset generation that could be shared between the two projects so you could get static or real-time tiles from the same spec.

msbarry commented 2 years ago

Another option is to only store intermediate vector tile features and render the tiles themselves on the fly when requested. The main downside here is that some very complex tiles could take 5+ seconds to render, for example some of these ones in Jakarta on a 2021 M1 MacBook Pro:

grischard commented 2 years ago

Would it speed things up to use the existing planet.mbtiles as a cache for subsequent runs? Then instead of throwing CPU/IO at the problem, it's cache invalidation you're dealing with, which is much more fun.

wipfli commented 2 years ago

@grischard can you share how you would approach the problem using the existing planet.mbtiles file?

grischard commented 2 years ago

I see that @msbarry's basic approach idea from 14 January is basically that, but better: patch the mbtiles with new tiles in place. I was suggesting creating a new mbtiles but fishing out the tiles that haven't changed from the old one.

msbarry commented 2 years ago

That is an interesting approach, so it would be something like:

?

That would take a bit longer for each update but eliminates some state we'd need to maintain between runs. I'd guess it might get us down into the ~15 minute range? It might be tricky to compute the set of changed tile IDs from just the change files though (i.e. a node moves, but it's part of a way that spans many tiles).
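For the single-node case, mapping a lon/lat to the tile containing it is standard web-mercator ("slippy map") math; the hard part, as noted, is that a change file alone doesn't carry the full geometry of the ways a moved node belongs to, so you can't enumerate all affected tiles without a lookup store. A sketch of the per-point tile math (class name hypothetical):

```java
// Standard web-mercator slippy-map tile math: maps a lon/lat to the x/y tile
// containing it at a given zoom. Enumerating every tile touched by a changed
// way would additionally need the way's full geometry, which is exactly the
// state a diff file alone doesn't provide.
public final class TileMath {
  public static int lonToTileX(double lon, int zoom) {
    return (int) Math.floor((lon + 180.0) / 360.0 * (1 << zoom));
  }

  public static int latToTileY(double lat, int zoom) {
    double latRad = Math.toRadians(lat);
    return (int) Math.floor(
        (1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2 * (1 << zoom));
  }
}
```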

ZeLonewolf commented 2 years ago

I'm wondering what optimizations might be possible if the .mbtiles were exploded into x/y/z pbfs (something that OWG is planning to do anyway). Perhaps the highest zooms could be run more frequently, similar to how it's done in osm-carto.

msbarry commented 2 years ago

@ZeLonewolf Are you saying they want to extract the tiles to files on disk?

We could try doing a planet generation with something like --minzoom=14. I think that would save quite a bit of io/cpu during tile generation.

ZeLonewolf commented 2 years ago

Yes exactly, extracted to disk and served statically. Turns out disk space is cheap.
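One wrinkle when exploding an .mbtiles into z/x/y files: the MBTiles spec stores tile rows in the TMS scheme, while static z/x/y serving (and most map clients) uses the XYZ scheme, so the y coordinate needs flipping on the way out. A minimal sketch of that conversion (class name hypothetical; actually reading the sqlite file would need a JDBC driver and is omitted here):

```java
// MBTiles stores tile_row in the TMS scheme; XYZ serving flips the y axis.
public final class TmsFlip {
  public static int tmsToXyzY(int zoom, int tmsY) {
    // At zoom z there are 2^z rows; TMS counts from the bottom, XYZ from the top.
    return (1 << zoom) - 1 - tmsY;
  }
}
```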

msbarry commented 2 years ago

Ok, I had thought about adding different output formats (pmtiles, files on disk) but didn't think anyone would want to deal with the 280 million files. If that's not the case, we can include that as a native output format in planetiler.