Open nyurik opened 3 years ago
The requirements for the tools would be derived from the new structure for defining a layer.
The structure could be based on the following principles. The transition of the layers to the new structure can be done on a layer-by-layer basis, once the tool support is available.
- `tags` column stores all OSM tag information; it will be used as the primary source for their values.
- `tags` column will be added to the vector tile.
- `columns` entry in `mapping.yaml` files will be restricted.
  - `name: geometry`: `type: geometry` or `type: validated_geometry`
  - `name: tags`: `type: hstore_tags`
  - `type: string` columns, when needed. For example:
    - `name: osm_id`: `type: id`
    - `name: z_order`: `type: wayzorder`
    - `name: area`: `type: area`
    - `name: mapping_key`: `type: mapping_key`
    - `name: <tag>`: `type: mapping_value`
The following tool-related enhancements could help implement the above:
- `{layer_tags}` macro that expands all entries in `tags` `query` entry of `<layer>.yaml` files.
- `{name_languages}` `.name`-related entries in `tags`, so `{name_languages}` need not be used with it.
- `delete_language_tags(tags)` SQL macro for removing `name`-related tags from `tags` `transportation` and `water` layers where names are not included in the layers themselves, but are extracted from OSM to be included in the `transportation_name` and `waterway` layers, respectively.
- `DELETE(tags, ARRAY['name', 'int_name', 'loc_name', 'wikidata', 'wikipedia', 'name:<lang>', ...])` to efficiently compute `tags - slice_language_tags(tags)`.
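To make the `DELETE(tags, ARRAY[...])` idea more concrete, here is a rough sketch of how the tools could generate that expression. This is only an illustration: the function name `delete_language_tags_sql` and the language list passed to it are made up for this example, and it assumes the hstore extension's `delete(hstore, text[])` function is what the SQL macro would expand to.

```python
def delete_language_tags_sql(column="tags", languages=("en", "de")):
    """Build an hstore expression that drops name-related keys from `column`.

    Hypothetical helper: the key list follows the one quoted in this thread,
    and `languages` stands in for whatever language list the tools use.
    """
    base_keys = ["name", "int_name", "loc_name", "wikidata", "wikipedia"]
    lang_keys = [f"name:{lang}" for lang in languages]
    # Quote each key as a SQL string literal, doubling embedded single quotes.
    quoted = ", ".join("'" + key.replace("'", "''") + "'" for key in base_keys + lang_keys)
    return f"delete({column}, ARRAY[{quoted}])"


if __name__ == "__main__":
    print(delete_language_tags_sql("tags", ["en", "de"]))
```

The point of generating the array once is that the key list stays in one place instead of being repeated in every layer's SQL.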
A further improvement over the direct use of a `{layer_tags}` macro would be that a missing `query` entry of a layer will be computed from the fields defined in `openmaptiles.yaml`, the `<layer>.yaml` file.
It would roughly be:
query: (SELECT geometry, {layer_non_tags}, {layer_tags} FROM layer_<layer>(!bbox!, z(!scale_denominator!))) AS t
Where `{layer_non_tags}` is a macro that expands to all fields, such as `class` and `brunnel`, that are defined in the `<layer>.yaml` file but are not covered by `{layer_tags}`.
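A possible sketch of how a missing `query` entry could be computed from the field lists. Everything here is an assumption for illustration: `default_layer_query` and its two helpers are hypothetical names, and the `NULLIF(tags->'x', '') AS "x"` form is borrowed from the proposal in the issue description rather than from any existing tool behavior.

```python
def sql_literal(text):
    """Quote text as a SQL string literal."""
    return "'" + text.replace("'", "''") + "'"


def sql_identifier(text):
    """Quote text as a SQL identifier."""
    return '"' + text.replace('"', '""') + '"'


def default_layer_query(layer_id, non_tag_fields, tag_fields):
    """Build the default query for a layer that has no explicit `query` entry.

    non_tag_fields: columns returned directly by the layer function
                    (the {layer_non_tags} part).
    tag_fields:     fields read out of the tags hstore (the {layer_tags} part).
    """
    columns = ["geometry"]
    columns += list(non_tag_fields)
    columns += [
        f"NULLIF(tags->{sql_literal(field)}, '') AS {sql_identifier(field)}"
        for field in tag_fields
    ]
    return (
        f"(SELECT {', '.join(columns)} "
        f"FROM layer_{layer_id}(!bbox!, z(!scale_denominator!))) AS t"
    )


if __name__ == "__main__":
    print(default_layer_query("transportation", ["class", "brunnel"], ["maxspeed"]))
```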
To be discussed: Should such SQL-generated fields be stored as separate columns, as entries in the `tags` column, or either way with an indication included in their `<layer>.yaml` entry?
Note: The `{layer_tags}` macro will still be useful when the `query` entry must be explicitly provided.
One of the challenges I see is that there is a fundamental limit of imposm mappings: you can only specify a single `tags` block that applies to the ENTIRE mapping. Thus, if you decide that a tag should be added to the `tags` hstore for one layer, it will get added for every single layer. This was fine when we were only using the tags hstore for `name` and `name:xx` values, but it could cause unexpected results such as fragmentation or unwanted extra fields when adding a tag to a layer.
Now, in general, the information domain helps us here. If we add `maxspeed` because we want that for road rendering, it's unlikely to show up in `landuse` because `maxspeed` is specific to the transportation domain.
Given this limitation of imposm, we would need to introduce a computed column (probably in a separate table, accessed via `JOIN` over primary keys to ensure a proper update triggering sequence) in between the table that imposm creates and the rest of the layer logic. The purpose of this column would be to hold a minimized version of the `tags` object that contains only the tags that the layer actually cares about. The column could be computed by a function generated by openmaptiles-tools, as determined by information in the layer yaml. The table would also need an update trigger to recompute the column each time a row changes.
This would allow us to build arbitrary `tags` objects, minimized to contain just the data that the layer needs, that can get replicated up through the table sequence and into the layers. It would also prevent object fragmentation, as described in
https://github.com/openmaptiles/openmaptiles/pull/1252#issuecomment-932094169.
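For illustration, the minimization step might look roughly like the sketch below. The names `minimize_tags` and `minimize_tags_sql` are hypothetical, and the SQL variant assumes the hstore extension's `slice(hstore, text[])` function would be used inside the generated function. The trigger and the separate table are left out.

```python
def minimize_tags(tags, wanted_keys):
    """Return only the tags a layer cares about (what the computed column would hold)."""
    wanted = set(wanted_keys)
    return {key: value for key, value in tags.items() if key in wanted}


def minimize_tags_sql(wanted_keys, column="tags"):
    """Build the equivalent hstore expression for use in a generated SQL function."""
    # Quote each key as a SQL string literal, doubling embedded single quotes.
    quoted = ", ".join("'" + key.replace("'", "''") + "'" for key in wanted_keys)
    return f"slice({column}, ARRAY[{quoted}])"


if __name__ == "__main__":
    row_tags = {"name": "A1", "maxspeed": "100", "height": "5"}
    print(minimize_tags(row_tags, ["maxspeed", "surface"]))
    print(minimize_tags_sql(["maxspeed", "surface"]))
```

Because unrelated keys such as `height` are dropped before the row moves up the table sequence, two rows that differ only in tags the layer ignores end up with identical minimized objects.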
We should have the ability in `openmaptiles.yaml` to specify:
`transportation_name` layer, for example, I can't do that without also generating the `transportation` layer in the tiles since one depends on the other.
> One of the challenges I see is that there is a fundamental limit of imposm mappings - you can only specify a single tags block that applies for the ENTIRE mapping. Thus, if you decide that a tag should be added to the tags hstore for one layer, it will get added for every single layer. This was fine when we're only using the tags hstore for name and name:xx value, but it could cause unexpected results such as fragmentation or unwanted extra fields when adding a tag to a layer.
It seems to be more than that. The `tags` column contains every value of every key used by any imposm table, not just keys in the `include` section!
For example, the `height` key is used as a column only by the `building` layer in tables that do not have a `tags` column. Nevertheless, some rows of the `tags` column of the `osm_highway_linestring` SQL table have `height` keys and values.
An `hstore`-type column in the SQL seems to be very handy for adding fields to OMT layers. On the other hand, we don't really need to use the imposm-generated `tags` column. Instead, we can use another `hstore` column, say `omt_tags`, that will only store the tags of the specific layer. The tools could create a layer-specific macro, similar to the `class` macro, that will be used to populate the `omt_tags` column from the imposm-generated table.
Hmm. So you want to map them as regular columns, pack them into an hstore, and then explode the hstore in the layer function? We could make that work.
Yes!
My only concern with this approach is that once the new syntax is available in `openmaptiles.yaml` it will be applicable to all layers. It implies that the tags/hstore mechanism will need to be implemented in all of them, which is not a trivial task. This includes layers like `aeroway` which are rarely updated or extended.
@nyurik, @ZeLonewolf, Any ideas on how to minimize such a migration effort?
I suppose that work just needs to get done, one layer at a time.
@zstadler we don't need to migrate anything at first -- only the layers that we want to support such functionality will implement it, and others should raise an error if a user tries to extend them.
`transportation` is probably the layer that is the most desired for extension so far. I'll take a stab at injecting the tags hstore in the layer tables.
One issue I've discovered is that tags that are injected into the tags hstore don't get imposm's type processing. If you have something mapped as a `boolean` in imposm, it automates the conversion of `yes`/`no`/`true`/`false` etc. to a boolean. The same presumably applies to other types, such as int.
Okay, actually this is not a problem at all thanks to postgres's typing: https://www.postgresql.org/docs/9.2/datatype-boolean.html
So `SELECT 'yes'::boolean`, for example, will return true. So this might work after all.
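As a quick way to reason about which tag values such a cast would accept, here is a small sketch that mirrors the literal values listed on the linked documentation page. It is an approximation only: the function name `pg_boolean` is made up, the sketch assumes case and surrounding whitespace are ignored, and it does not model any prefix matching that Postgres may perform.

```python
TRUE_LITERALS = {"t", "true", "y", "yes", "on", "1"}
FALSE_LITERALS = {"f", "false", "n", "no", "off", "0"}


def pg_boolean(value):
    """Approximate how a text value would be read by a ::boolean cast.

    Raises ValueError for anything outside the documented literals, which is
    where a real cast would fail with an invalid input error.
    """
    text = value.strip().lower()
    if text in TRUE_LITERALS:
        return True
    if text in FALSE_LITERALS:
        return False
    raise ValueError(f"invalid input for boolean: {value!r}")


if __name__ == "__main__":
    for raw in ["yes", "no", "true", "false"]:
        print(raw, pg_boolean(raw))
```

One thing this makes visible: OSM values outside that list (for example `designated`) would not convert, so a plain cast on arbitrary tag values could raise an error unless those values are filtered first.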
As discussed in https://github.com/openmaptiles/openmaptiles/pull/1252, we need a way for users to extend which fields to include with a layer without modifying the actual code. Instead, users will modify the main yaml file (i.e. `openmaptiles.yaml`) to specify needed changes.
The above would only work for layers that expose a `tags` HSTORE field at the top SQL layer, and have a magic keyword as part of their SQL statement. Resulting layer modifications:
Please update the following sections with the exact specification, or add comments with proposals:
Imposm mapping file
TODO: what changes should tools make to the mapping file(s), and how should tools determine which file/table/key should be modified?
Add field declarations
Layer specification is updated dynamically: new fields with their descriptions are added to the `layer.fields` map. TODO: decide if we should support the `values` param, and if so, how. TODO: decide if we want to support updates to existing fields, for example adding more enum values to `class`. Is this needed at all? E.g. if we add a value to transportation class, it will not be straightforward to change which of them are included in z12.
Update main SQL statement
TODO: We currently use `%%FIELD_MAPPING: class %%` for SQL modifications. We could use something similar here, e.g. introduce a new `%%CUSTOM_TAGS%%`? Replacement might be in the form `NULLIF(tags->'maxspeed', '') AS "maxspeed",`, or we may want to allow users to provide their own SQL.