openmaptiles / openmaptiles-tools

Tools to turn the schema into other formats
MIT License
397 stars 136 forks source link

Add support for extendable layer schema #377

Open nyurik opened 3 years ago

nyurik commented 3 years ago

As discussed in https://github.com/openmaptiles/openmaptiles/pull/1252, we need a way for users to extend which fields to include with a layer without modifying the actual code. Instead, users will modify the main yaml file (i.e. openmaptiles.yaml) to specify needed changes.

tileset:
  layers:
    # ...
    - file: layers/transportation/transportation.yaml
      add_tags:
          maxspeed:
            description: Original value of [`maxspeed`](http://wiki.openstreetmap.org/wiki/Key:maxspeed) tag, if exists.
          maxheight:
            description: Original value of [`maxheight`](http://wiki.openstreetmap.org/wiki/Key:maxheight) tag, if exists.
          tracktype:
            description: Original value of [`tracktype`](http://wiki.openstreetmap.org/wiki/Key:tracktype) tag, if exists.
  # ...

The above would only work for layers that expose tags HSTORE field at the top SQL layer, and have a magic keyword as part of its SQL statement. Resulting layer modifications:

Please update the following sections with the exact specification, or add comments with proposals:

Imposm mapping file

TODO: what changes should tools do to the mapping file(s), and how should tools determine which file/table/key should be modified

Add field declarations

Layer specification is updated dynamically: new fields with their descriptions are added to the layer.fields map. TODO: decide if we should support values param, and if so, how. TODO: decide if we want to support updates to existing fields, for example adding more enum values to class. Is this needed at all? E.g. if we add a value to transportation class, it will not be straightforward to change which of them are included in z12

Update main SQL statement

TODO: We currently use %%FIELD_MAPPING: class %% for SQL modifications. We could use something similar here, e.g. introduce a new %%CUSTOM_TAGS%%? Replacement might be in the form NULLIF(tags->'maxspeed', '') AS "maxspeed",, or we may want to allow users to provide their own SQL.

zstadler commented 3 years ago

The requirements from the tools would be derived from the new structure of defining a layer.

The structure could be based on the following principals. The transition of the layers to the new structure can be done on a layer-by-layer basis, once the tool support is available.

The following tool-related enhancements could help implementing the above:

zstadler commented 3 years ago

A further improvement over the direct use of a {layer_tags} macro would be that a missing query entry of a layer will be computed from the fields defined in openmaptiles.yaml, the <layer>.yaml file.

It would roughly be:

    query:  (SELECT geometry, {layer_non_tags}, {layer_tags} FROM layer_<layer>(!bbox!, z(!scale_denominator!))) AS t

Where {layer_non_tags} is a macro that expands to all fields, such as class and brunnel, that are defined in the <layer>.yaml file but are not covered by {layer_tags}.

To be discussed: Should such SQL-generated fields be stored as separate columns, as entries in the tags column, or either way with an indication included in their <layer>.yaml entry.

Note: The {layer_tags} macro will still be useful when the query entry must be explicitly provided.

ZeLonewolf commented 3 years ago

One of the challenges I see is that there is a fundamental limit of imposm mappings - you can only specify a single tags block that applies for the ENTIRE mapping. Thus, if you decide that a tag should be added to the tags hstore for one layer, it will get added for every single layer. This was fine when we're only using the tags hstore for name and name:xx value, but it could cause unexpected results such as fragmentation or unwanted extra fields when adding a tag to a layer.

Now, in general, the information domain helps us here. If we add maxspeed because we want that for road rendering, it's unlikely to show up in landuse because maxspeed is specific to the transportation domain.

Given this limitation of imposm, we would need to introduce a computed column (probably in a separate table, accessed via JOIN over primary keys to ensure a proper update triggering sequence) in between the table that imposm creates, and the rest of the layer logic. The purpose of column would be to hold a minimized version of the tags object that contains only the tags that the layer actually cares about. The column could be computed by a function which is generated by openmaptiles-tools as determined by information in the layer yaml. The table would also need an update trigger to recompute the column each time a row changes.

This would allow us to build arbitrary tags objects that can get replicated up through the table sequence and into the layers that is minimized to contain just the data that the layer needs, and prevent object fragmentation, as described in https://github.com/openmaptiles/openmaptiles/pull/1252#issuecomment-932094169.

ZeLonewolf commented 3 years ago

We should have the ability in openmaptiles.yaml to specify:

  1. That a layer is disabled. For optional layers, this is better than simply commenting out option lines because it enables automated testing of optional layers by changing the yaml at runtime.
  2. That a layer should have its tables/SQL generated but not produce tiles. This would allow for us to have "base" layers that can be extended without creating unwanted entries in the tiles. Currently, if I want to render ONLY the transportation_name layer, for example, I can't do that without also generating the transportation layer in the tiles since one depends on the other.
zstadler commented 3 years ago

One of the challenges I see is that there is a fundamental limit of imposm mappings - you can only specify a single tags block that applies for the ENTIRE mapping. Thus, if you decide that a tag should be added to the tags hstore for one layer, it will get added for every single layer. This was fine when we're only using the tags hstore for name and name:xx value, but it could cause unexpected results such as fragmentation or unwanted extra fields when adding a tag to a layer.

It seems to be more than that. The tags column contains every value of every key used by any imposm table, not just keys in the include section!

For example, the height key is used as a column only by the building layer in tables that do not have a tags column. Nevertheless, some rows of the tags column of the osm_highway_linestring SQL table have height keys and values.

An hstore-type column in the SQL seems to be very handy for adding fields to OMT layers. On the other hand, we don't really need to use the imposm-generated tags column. Instead, we can use another hstore column, say omt_tags, that will only store the tags of the specific layer. The tools could create a layer-specific macro, similar to the class macro, that will be used to populate the omt_tags column from the imposm-generated table.

ZeLonewolf commented 3 years ago

An hstore-type column in the SQL seems to be very handy for adding fields to OMT layers. On the other hand, we don't really need to use the imposm-generated tags column. Instead, we can use another hstore column, say omt_tags, that will only store the tags of the specific layer. The tools could create a layer-specific macro, similar to the class macro, that will be used to populate the omt_tags column from the imposm-generated table.

Hmm. So you want to map them as regular columns, pack them into an hstore, and then explode the hstore in the layer function? We could make that work.

zstadler commented 3 years ago

Yes!

zstadler commented 3 years ago

My only concern with this approach is that once the new syntax is available in openmaptiles.yaml it will be applicable to all layers. It implies that the tags/hstore mechanism will need to be implemented in all of them - not a trivial task. This includes layers like aeroway which are rarely updated or extended.

@nyurik, @ZeLonewolf, Any ideas on how to minimize such a migration effort?

ZeLonewolf commented 3 years ago

Any ideas on how to minimize such a migration effort?

I suppose that work just needs to get done, one layer at a time.

nyurik commented 3 years ago

@zstadler we don't need to migrate anything at first -- only the layers that we want to support such functionality will implement it, and others should raise an error if a user tries to extend them.

ZeLonewolf commented 3 years ago

transportation is probably the layer that is the most desired for extension so far. I'll take a stab at injecting the tags hstore in the layer tables.

ZeLonewolf commented 3 years ago

One issue I've discovered is that tags that are injected into the tags hstore don't get imposm's type processing. If you have something mapped as a boolean in imposm, it automates the conversion of yes/no/true/false etc to a boolean. Same presumably applies to other types, such as int.

ZeLonewolf commented 3 years ago

Okay, actually this is not a problem at all thanks to postgres's typing: https://www.postgresql.org/docs/9.2/datatype-boolean.html

So SELECT 'yes'::boolean for example will return true. So this might work after all.