gravitystorm / openstreetmap-carto

A general-purpose OpenStreetMap mapnik style, in CartoCSS
Other
1.53k stars 817 forks source link

Move to osm2pgsql flex backend #4112

Open pnorman opened 4 years ago

pnorman commented 4 years ago

osm2pgsql has a new backend, the flex backend. This is a successor to the multi backend, and allows the style-supplied transformation to configure much more, including defining what tables to create.

Importantly for us, it allows callbacks for us to include information from relations on the geometries created from ways. This will allow us to unblock #596 and has the potential to let us deduplicate admin geometries at import time.

Besides offering features we need, developing transforms for it will help osm2pgsql uncover any issues in the backend and help with adoption of its new features.

As part of this, I'd like to make a few changes

This will leave us with with five tables

Table name Contents
planet_osm_point Current contents
planet_osm_polygon Current contents including admin areas as polygons without highway areas
planet_osm_line Current contents without roads, railways, or admin lines from relations
planet_osm_transport_line Roads and rail with information from parent route relations and z_order
planet_osm_transport_polygon Roads and rail polygons with z_order

This will allow us to

This is a long-running project, but because we can reuse our existing Lua code there's less work than the initial move to Lua transforms. I think we're also better at project management now. There are a few milestones, and some tasks external to osm-carto.

Until we take the step of removing features from existing tables and changing the SQL, it's easy to maintain the fork. Once we take that step we start incurring cost from having to port changes from one branch to the other.

cc @joto @gravitystorm @lonvia

joto commented 4 years ago

As author of the flex backend code I welcome everybody who wants to try it out. Just be aware that it is still marked experimental and is not in a released osm2pgsql version yet. There are still some rough edges to smooth out. That being said, it would, of course, help to have more eyes on this to find any problems and move the new backend forward and get it released. Being able to support the needs of openstreetmap-carto is definitely very high on our list of things to do.

I would recommend to not use the stage 2 processing (for the relation tags on ways) at first, but try the flex backend without it. We don't know yet what the performance implications of the two-stage processing are in a real-world scenario and how solid the updating mechanism is, so just be careful with that. But even without that, the flex backend has many advantages.

mboeringa commented 4 years ago

planet_osm_line Current contents without roads, railways, or admin lines from relations

@pnorman , for backwards compatibility with other styles and other uses of the current OpenStreetMap database as imported with the 'openstreetmap-carto' style, I strongly recommend to not remove roads, railway, or admin lines from the planet_osm_line table, but just leave the contents of the existing planet_osm_polygon/line/point tables as it is currently.

This allows current users that rely on a minimal three table setup of the tables (polygon/line/point), to continue doing that. Currently, the planet_osm_line table has all the information you need to also render roads, and you can thus ignore the planet_osm_road table and do your own selection of "roads" from it through SQL. Changing the contents of planet_osm_line to exclude roads and railways will block that option.

I know this means you can't optimize the planet_osm_line table, but I do think this is better for downstream users relying on a basic three table setup.

gravitystorm commented 4 years ago

I would recommend to not use the stage 2 processing (for the relation tags on ways) at first, but try the flex backend without it. We don't know yet what the performance implications of the two-stage processing are in a real-world scenario and how solid the updating mechanism is, so just be careful with that.

I think the only thing I have to add here is that the logic required for advanced relation processing in a flex lua file becomes quite complex quite quickly, and fairly impenetrable for someone like me who doesn't really know how osm2pgsql works internally. For example, knowing what features to mark, what should be done in which stage, etc. Then there's the challenge of ensuring that the lua code also processes updates correctly, again with things like marking the right features and re-processing them. While re-running an import of an extract is straightforward, checking updates is not.

I've found that I need to treat the flex lua file as code, rather than configuration, and I've ended up writing rspec integration tests to check that my lua code does the right thing for imports and updates (i.e. by creating mock xml and osc files, running the osm2pgsql binary, and then inspecting contents of the database, for each test). I hope this will be unnecessary in future, as more of the processing complexity is made available in the included helper functions (like grab_tag, which came out of my lua work and avoids some pass-by-reference pitfalls).

I can do things with flex that used to require having a custom fork of osm2pgsql, and I've got some stuff in development that wasn't possible even with that. But so far having this test framework has been useful both while developing my lua code, and also for retesting as osm2pgsql gets updated.

pnorman commented 4 years ago

I've created a new long-running branch that currently reimplements existing behavior with the flex backend.

I'm treating it as code and have fairly comprehensive unit tests. These don't require running osm2pgsql, but test the lua code, mocking out the osm2pgsql supplied objects and functions.

pnorman commented 4 years ago

@pnorman , for backwards compatibility with other styles and other uses of the current OpenStreetMap database as imported with the 'openstreetmap-carto' style, I strongly recommend to not remove roads, railway, or admin lines from the planet_osm_line table, but just leave the contents of the existing planet_osm_polygon/line/point tables as it is currently.

This allows current users that rely on a minimal three table setup of the tables (polygon/line/point), to continue doing that. Currently, the planet_osm_line table has all the information you need to also render roads, and you can thus ignore the planet_osm_road table and do your own selection of "roads" from it through SQL. Changing the contents of planet_osm_line to exclude roads and railways will block that option.

I know this means you can't optimize the planet_osm_line table, but I do think this is better for downstream users relying on a basic three table setup.

We're going to be breaking backwards compatibility regardless with the changes to the columns in those tables. Is there a use-case for keeping roads in planet_osm_line without columns that other styles depend on?

mboeringa commented 4 years ago

Is there a use-case for keeping roads in planet_osm_line without columns that other styles depend on?

That depends. If you intend to remove the tags hstore column as well from planet_osm_line, than yes, there is not much sense in maintaining the roads in planet_osm_line, because there will be no way to restore the old behavior. On the other hand, if the tags column is still there, any missing columns could relatively easily be extracted from hstore by changing SQL.

Maybe it is better to ask for feedback about this on the mailing list or forums though.

I've created a [new long-running branch]([https://github.com/gravitystorm/openstreetmap-carto/tree/flex/master]](https://github.com/gravitystorm/openstreetmap-carto/tree/flex/master%5D) that currently reimplements existing behavior with the flex backend.

Thanks, that will definitely be of use to some, and likely makes the "backwards compatibility" issue also referred to above, largely a non-issue.

Will there also soon be a branch showing off some of the proposed changes in table structure as in your first post in this thread (https://github.com/gravitystorm/openstreetmap-carto/issues/4112#issue-595551934)? Would be interesting to see.

pnorman commented 4 years ago
  • Bring parent route relation information onto linear transportation elements (roads and rail) from ways

I gave this a think, and there are a few problems with this approach that I solved by instead making a planet_osm_route table that contains route membership information. This has a few advantages.

mboeringa commented 4 years ago

@pnorman , I am actually not entirely sure where it is best to post this, as I am not sure if this is a style issue, or an osm2pgsql flex backend issue, but I noticed the "network" column is not set in the planet_osm_route table, even though the "tags" column clearly shows it is available? This was based on a git pull of the latest openstreetmap-carto flex backend style and also osm2pgsql.

I have looked through the LUA style code trying to understand where it might go wrong, but it is unclear to me where the issue is.

afbeelding

jeisenbe commented 3 years ago

I note that the flex backend was released in 1.3.0 of osm2pgsql a few months ago, and recently updated in release 1.4.0: https://github.com/openstreetmap/osm2pgsql/releases/tag/1.4.0 - and there is now extensive documentation at https://osm2pgsql.org/

mboeringa commented 3 years ago

@jeisenbe , I have just released an updated version of the flex script, that adds some nice new options, and additionally fixing a blocking issue due to the use of an outdated 'multi' option in the existing lua file. This also fixes the issue of the 'network' column not being set in the new 'route' table that is part of this flex version, as I noted in https://github.com/gravitystorm/openstreetmap-carto/issues/4112#issuecomment-633057370.

See here: https://github.com/gravitystorm/openstreetmap-carto/pull/4320

pnorman commented 3 years ago

I have the admin stuff half done locally. I've written the code to do it, but need to finish writing the test cases.

pnorman commented 3 years ago

I have the admin stuff half done locally. I've written the code to do it, but need to finish writing the test cases.

This is done and pushed.

joto commented 3 months ago

I am restarting this effort to switch to osm2pgsql flex. It is sufficiently different from what was discussed in this issue that I created a new isse, see #4977 for all the details.