hdelva closed this 5 years ago
@hdelva Agreed on this approach. Could you perhaps build a small example of how to handle this, for the highway tags for example?
This is indeed related to #21 but we probably need to discuss this first.
Actually, I think we'd still run into issues when trying to define the tags' semantics. I'm not an expert on OSM data but some tags seem to mean different things to different people. I came across https://wiki.openstreetmap.org/wiki/Path_controversy and https://wiki.openstreetmap.org/wiki/WikiProject_Denmark#Inconsistent_tagging_of_Danish_roads. And even when there is a consensus, communities often have their own conventions (https://wiki.openstreetmap.org/wiki/WikiProject_Belgium/Conventions/Highways) which may not be interoperable with other communities'.
To me this means we cannot refer to a wiki when building our vocabulary and that we should create our own unambiguous definitions. But we'll have to make sure we preserve the semantics the mapper intended, and this doesn't seem possible when we add semantics to the existing tags. I suggest we create our own, more limited set of properties.
For example, instead of trying to define the highway=footway, highway=path, highway=steps, highway=pedestrian tags, we can infer something like walk=yes. Similarly, each community should agree that a highway=primary is more important than a highway=secondary, so perhaps these could become importance=1 and importance=2 properties. The underlying semantics of that field can then simply be defined as 'more is better'.
Does this make sense? And does it sound doable?
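A minimal sketch of what such an inference could look like, purely to make the proposal concrete. The function name and both lookup tables are hypothetical; only the tag values and the walk/importance properties come from the comment above.

```python
# Hypothetical lookup tables for illustration; not an agreed vocabulary.
WALKABLE_HIGHWAYS = {"footway", "path", "steps", "pedestrian"}
IMPORTANCE_BY_HIGHWAY = {"primary": 1, "secondary": 2}


def infer_properties(tags):
    """Derive the simplified properties from a way's raw OSM tags."""
    inferred = {}
    highway = tags.get("highway")
    if highway in WALKABLE_HIGHWAYS:
        inferred["walk"] = "yes"
    if highway in IMPORTANCE_BY_HIGHWAY:
        inferred["importance"] = IMPORTANCE_BY_HIGHWAY[highway]
    return inferred
```

Ways whose highway value is in neither table simply get no inferred properties, which is where the original meaning of the tag would be lost.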
I think going down this road would be a step too far and could lead us down a path without an end. We also don't want to make too many assumptions about how the tiles are going to be used.
If we infer walk=yes, for example, we are removing some of the meaning found in the OSM data and we are making a decision for users. The same goes for bicycle=yes, but then we have to ask what a bicycle is, and this will also differ between countries. The same goes for all other vehicle definitions. I think the best-effort approach in OSM to finding a consistent tagging system and catering to all these differences is pretty successful.
Those wiki pages focus on the inconsistencies there are in OSM but these exist in any dataset of the road network of this size. Overall OSM is pretty good at describing the road network consistently and it works well for routing.
Fair enough, I suppose it doesn't make sense to focus on the inconsistently tagged data if most of it is fine.
That aside, what sort of example did you have in mind? Should I go over the wiki page and structure the tags I find there, or should I use some actual ways from OSM and clean those up?
Perhaps we could start with the highway and name tags? And move on from there?
I've created a minimal ontology along with a config file that should make it easier to publish data. Everything is still open for discussion of course, and feedback is welcome. I'm hosting a rendered version on http://hdelva.be/tiles/ns/terms.html for now, but ideally this'll move to somewhere on the openplanner site with a redirect from https://w3id.org/openstreetmap/terms# to wherever we host it. The original ontology and the mapping file live in https://github.com/openplannerteam/routable-tiles-ontology now.
The readme is a bit crude, so here's an example of what would change. What used to be
{
  "@context": {
    "osm:nodes": {
      "@container": "@list",
      "@type": "@id"
    }
  },
  "@graph": [
    {
      "@id": "http://www.openstreetmap.org/way/13226858",
      "@type": "osm:Way",
      "rdfs:label": "Nieuwlandlaan",
      "osm:width": "osm:6",
      "osm:highway": "osm:tertiary",
      "osm:surface": "osm:asphalt",
      "osm:maxspeed": "50",
      "osm:nodes": [
        "http://www.openstreetmap.org/node/421872676",
        "http://www.openstreetmap.org/node/156522403",
        "http://www.openstreetmap.org/node/260601808"
      ]
    }
  ]
}
Would become
{
  "@context": {
    "osm:highway": {
      "@type": "@id"
    },
    "osm:hasNodes": {
      "@container": "@list",
      "@type": "@id"
    }
  },
  "@graph": [
    {
      "@id": "http://www.openstreetmap.org/way/13226858",
      "@type": "osm:Way",
      "osm:name": "Nieuwlandlaan",
      "osm:highway": "osm:Tertiary",
      "osm:maxspeed": 50,
      "osm:hasNodes": [
        "http://www.openstreetmap.org/node/421872676",
        "http://www.openstreetmap.org/node/156522403",
        "http://www.openstreetmap.org/node/260601808"
      ],
      "osm:hasTag": [
        "width=6",
        "surface=asphalt"
      ]
    }
  ]
}
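The step from raw tags to this second form could be sketched roughly as follows. This is an illustration under assumptions, not the code in the feature branch: the function name is hypothetical, the set of keys with defined semantics is hard-coded instead of read from the mapping file, and converting underscore-separated values to PascalCase is a guess at how multi-word values would be handled.

```python
def way_to_jsonld(way_id, tags, node_ids):
    """Build the JSON-LD object for one way from its raw OSM tags."""
    way = {
        "@id": f"http://www.openstreetmap.org/way/{way_id}",
        "@type": "osm:Way",
    }
    leftover = []
    for key, value in tags.items():
        if key == "name":
            way["osm:name"] = value
        elif key == "highway":
            # Named individuals in PascalCase, e.g. tertiary -> Tertiary.
            # Splitting on "_" is an assumption for multi-word values.
            pascal = "".join(part.capitalize() for part in value.split("_"))
            way["osm:highway"] = "osm:" + pascal
        elif key == "maxspeed" and value.isdigit():
            way["osm:maxspeed"] = int(value)
        else:
            # No defined semantics (yet): keep the raw key=value string.
            leftover.append(f"{key}={value}")
    way["osm:hasNodes"] = [
        f"http://www.openstreetmap.org/node/{node_id}" for node_id in node_ids
    ]
    if leftover:
        way["osm:hasTag"] = leftover
    return way
```

Note that in this sketch a non-numeric maxspeed value also ends up in the osm:hasTag list rather than being dropped.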
The main differences are:
- osm:highway now has a type in the context, because its values are now semantically defined.
- osm:nodes was renamed to a more conventional osm:hasNodes. I don't think we should do this for the properties that are derived from OSM tags though.
- osm:tertiary became osm:Tertiary because named individuals are usually in PascalCase.
- osm:maxspeed became numeric.
- The name=* key became osm:name because the rdfs:label property is meant for human consumption.
- There is a new hasTag list with all the tags that don't have defined semantics (yet).

This is excellent, exactly what we needed. We will do some coding and get back to you once we have something useful! :+1:
Work is happening in: https://github.com/openplannerteam/routeable-tiles/tree/features/structure-osm-tags
Main todo still left is to parse the mapping json.
This is now done in the feature branch. Waiting for #28 for deployment.
More work is being done in this repo:
Possibly related to #21.
It seems like there are two kinds of tags: one is structured with a predefined set of possible values, the other seems to contain free-form values. Both are currently represented as string literals that start with osm:. For example:
- osm:highway seems structured, with its values documented on https://wiki.openstreetmap.org/wiki/Key:highway#Values.
- osm:destination:lanes seems less structured; its possible values seem to be concatenated informal location descriptions such as osm:U.Z.|Oudenaarde;Zwijnaarde;EXPO;Haven|Oudenaarde;Zwijnaarde;EXPO;Haven.

I'd suggest these changes:
- Don't add the osm: prefix to the values of the second kind of tags, and just keep them as string literals.
- Change the @type of the first kind of tags to @id. This obviously depends on the vocabulary that still has to be defined, but it'd be nice if someone with OSM experience could give us a list of structured tags and values.