atorger / nvdb2osm

The Unlicense

Tagging comments #15

Closed NKAmapper closed 3 years ago

NKAmapper commented 3 years ago

I have now been able to see more example files. They are very impressive, and I think I understand the massive effort which has been undertaken. Kudos.

I have a few observations which perhaps could be helpful. Please do not regard this as negative feedback - they are only details. And I know from my own work how valuable it can be to have another pair of eyes on the results.

Here we go:

  1. Traffic signals: crossing=traffic_signals seems to be used instead of highway=traffic_signals at highway junctions with no cycleway/footway. Probably a mistake.
  2. Maxheight is usually put on the way. It is not permitted on a node, according to the wiki. (But having it on a node is better than not having it at all.)
  3. Roundabouts shall not contain the street name, according to the wiki (but rather the name of the junction, or leave empty).
  4. Are there any turn:lanes data in NVDB?
  5. Bus lanes: There are a number of lanes:bus=* and lanes:bus:forward=* tags. This tag only counts the lanes, but it does not specify a restriction. I think motor_vehicle=no + bus=yes is the required tagging, or bus:lanes=*|*|* etc if we know which lane is involved. Also, lanes=2 + lanes:bus=2 has the implication that motor_vehicle=no, but the latter is sometimes not tagged in the files. I guess the same comments apply for taxi, psv, hgv etc.
  6. Lanes: My personal preference is to avoid lanes=1 (on oneway streets) and lanes=2 (on two-way streets) if that is anyway the default value (so it is redundant).
  7. :forward is not necessary on a oneway road. Tagging such as oneway=yes + lanes=3 + lanes:forward=2 is confusing, and not correct. Perhaps lanes:motor_vehicle=2 would be correct (but not needed if another tag is lanes:bus=1)
  8. :backward on oneway streets is likely a mistake.
  9. Pedestrian streets get quite busy tagging (Gamla stan, for example). Perhaps emergency=yes and hazmat=no could be dropped. On the other hand, bicycle=yes could be necessary to get bike routing on some of the routing engines (if bikes are permitted on the pedestrian street).
  10. Ferries could get more tags, for example ferry=primary, ref=* , motor_vehicle=yes/no, foot=yes etc.
  11. Crossing: Need to decide if this tag should be included on ways or not. Creates more partitions.
  12. Some of our decoding of the crossing, traffic_calming and footway/cycleway tables differ, but not sure who is "right" :) For example GCM type 9 seems to be on asphalt most of the time, so perhaps path is not a good default.
  13. Oneway on cycleway and path is a bit strange.
  14. Platforms: They are tagged highway=platform + railway=platform + public_transport=platform. Perhaps railway=platform is enough (it seems to be the one mostly used).
  15. I have already commented on the highway classes in #10. For example, >90% of highway=track seems wrong in urban/semi-urban municipalities, according to tagging conventions, so highway=service will save a lot of time for the importer (which will then have to change just 10%).
  16. Tertiary is hardly used. Perhaps a wrong cutoff for functional class?
  17. Would be nice to have motorway_link, trunk_link, primary_link.
  18. In some cases, bridges and tunnels are connected with the road underneath (or above).
  19. In some cases, there is both a tunnel and a bridge which are crossing, but only one of them is needed (preferably the tunnel if we have a cycleway underneath).
  20. I think the name=* for cycleways and footways are supposed to be used for specific names of the cycleway/footway, and they should not have the name of the street next to it. Some cycleways have their own name (C-Cykelled/Namn) which could be used instead, for example Kattegattleden in Göteborg.
  21. Covered=yes could be used for footways inside buildings.
  22. Width: This tag is on 70% of all streets in for example Göteborg. I think width should be omitted unless there is a special/signed restriction, for example a narrow duct under a railway. The reason is that edits after the import are seldom going to get the width right, so we will get a confusing mix of imported values and new edits based on guessing after a few years. Even during import, it is difficult to adjust the width value if for example the number of lanes is modified. The width tag also causes ways to be broken into several partitions. Some of the width values have a lot of decimals, by the way.
  23. Environmental zone: This tag is on every street inside the zone, for example on 4.700 streets in the centre of Göteborg. These zones are used for restricting the use of fossil fuel cars, and are likely to change in the future. I think it is better to make a polygon around the defined zone instead of putting it on every street. The reason is that it will be too difficult to maintain it on every street in OSM. These zones are likely to be modified in the future, and it will be easier to modify a polygon rather than thousands of streets. Also, we cannot expect every user in OSM to remember including it on a new street which he or she might be adding within the zone.
  24. Hazmat: This tag is used for a large proportion of highways, for example on 78% of all streets in Göteborg. We are likely not going to be able to maintain such a tag in OSM consistently. Perhaps it could be kept for hazmat=designated only, or be used for motorway/trunk/primary/secondary only, or dropped altogether.
  25. Maxweight: It is included for a large proportion of highways, for example 14.450 streets in Stockholm (57%). The large majority of these roads are not signed with any maxweight restriction (the restriction is provided in lists from Trafikverket), so we are getting into a difficult territory in OSM. Would it be possible to only use this tagging where it is signed on the road (if that data exists)? Or it could be included only for bridges, which is usually the problem. Also, if the quality of this data is weak for municipality and private roads, it could be included only for state and county roads.
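
Points 6 and 7 could be handled by a small clean-up pass over the tags of each way. The following is only a sketch of that idea, not code from nvdb2osm: the function name is hypothetical, tags are assumed to be a plain dict of strings, and the defaults (1 lane on oneway, 2 on two-way) are taken from point 6.

```python
def drop_redundant_lane_tags(tags):
    """Return a copy of tags without lane counts that only restate the default.

    Hypothetical helper for illustration; not part of nvdb2osm.
    """
    out = dict(tags)
    oneway = out.get("oneway") == "yes"
    if oneway:
        # Point 7: a direction suffix adds nothing on a oneway road.
        out.pop("lanes:forward", None)
    # Point 6: assumed defaults, 1 lane on oneway and 2 lanes on two-way streets.
    default_lanes = "1" if oneway else "2"
    # Keep the total if other lane-related tags still depend on it.
    has_lane_details = any(
        k.startswith("lanes:") or k.endswith(":lanes") for k in out
    )
    if out.get("lanes") == default_lanes and not has_lane_details:
        del out["lanes"]
    return out
```

The total lane count is deliberately kept whenever tags such as lanes:bus:forward are present, since those are hard to interpret without it. Nothing is done about :backward on oneway streets (point 8), as that may be an error in the source data rather than in the translation.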
atorger commented 3 years ago

Great feedback, I'll look into it as soon as I can in more detail, quick comments:

Indeed I think we could remove some of the fields. The initial design was to try to extract all information there is that is actually extractable without giving it much thought, and then work from there when we see what we got. I'm perfectly fine with removing some fields. I haven't needed to consider that in the rural areas I've worked in so far, except for one thing - maxspeed, which I'll file a separate issue for.

atorger commented 3 years ago

It will take some time to investigate and fix all this, but here's a first readthrough.

An overall comment, some things you mention are hard to maintain and/or may be distorted by further manual edits. This is a principle that I guess we need to decide on. I have the idea that in the future when things are placed properly in the map with good coverage you can do updates much more automated, and therefore you can allow for richer amount of tags, also tags that cannot practically be mapped manually on the ground by amateurs, but need institutes like Trafikverket and actual road maintainers. However, I don't have a strong opinion about this, so if we decide to strip away some tags that's okay.

  1. I'll look into that, I have also the other issue of potentially simplifying cycleway crossings I haven't looked into yet
  2. I think maxheight on node attached to a road makes sense and should work for data consumers, just like a barrier node, and I'm fine with keeping that. I could look into if there's a way to convert them automatically though, but I'm not sure I think it's worth it. If there's a push for it I'll do it, but I'm fine with it as it is. Sometimes maxheight is tied to a tunnel, which is quite easily converted, but other times it's tied to a bridge passing above, sometimes more than one, which makes it harder.
  3. Roundabouts do get the name of the roundabout already, however if there is no name of the roundabout the other name is kept and NVDB names the roundabouts with street name. I'll look into that, should be easy to remove.
  4. Unfortunately not. Multi-lane tagging is limited :-(
  5. We don't know which lanes are which from NVDB, but the validator warns about it and one can usually figure it out from aerials, so it's unfortunately a manual thing. I'll look into the other comments, but overall the NVDB info is limited here so I don't think we can reach 100%, there's no information if both bus or taxi is allowed or only bus, I think in most cases in Sweden it's only bus though.
  6. Thanks I had not considered that, my personal preference in general is to avoid redundant tags if possible, so I'm all for. When merging roads now I see a lot of "oneway=no" in the database now on major roads which drives me crazy :-)
  7. There is a simplify pass for oneway already that should remove these, so that's a bug, well spotted
  8. Yes, I've seen that in some place, I forgot where. It would be good if you have a reference so I can see how it occurred. I think it may actually be the NVDB source data which has the error.
  9. Complex, need to look into that
  10. Ack, need to study
  11. I'm for simplifying them (separate issue), just haven't got around to do it yet (not much of an issue in rural areas).
  12. Ack, need to study
  13. Need to look at source NVDB data which caused this, there may be a reason.
  14. Ack
  15. I disagree on this point. Service should not be used on skogsbilvägar, track should for those with minor importance, and unclassified with more importance. I'm not trying to push my own idea here though, that's how it has been done in Sweden for years. If I'm wrong, I'll change tagging. In any case we need to discuss that with the community, highway tagging seems to be a quite hot potato.
  16. I think it's right (easy to check though), but one still often needs to change it to match what's current in OSM, which is what I follow 95% of the time.
  17. Agree, need to study if possible
  18. Yes this is a NVDB data error. As the validator warns about it I thought it's not really worthwhile to solve it. It becomes worse when you have bridges over bridges or tunnels and things like that, those need to be manually resolved.
  19. This should be resolved to some extent already, seems like it should be a rare thing. May be possible to do more in the bridge/tunnel resolve algorithm. However, note that the bridge/tunnel data in NVDB is not that great, so I'm reluctant making the algorithm "too smart" as the input is not fully reliable.
  20. Names of cycleways are not resolved, they are as reported by NVDB, and as cycleways are mostly managed by the municipalities themselves you get what they report. I do manually change names from time to time. Named cycleways are quite rare, seems to be a big city thing.
  21. Agree, need to study if possible
  22. I'm undecided about this tag. That width is uncommon for manual mapping is natural, people in general just don't measure the road. It causes shorter sections. However, it's also a tag that relatively easily can be synced automatically in the future. The many decimals come from the database, but should be simplified, noted. My experience with the width data so far is that it is of good quality except for skogsbilväg (where it's already excluded), so I've decided to keep it in for now. Width is a property of the roads, width restrictions should be noted not with width, but with maxwidth, and this is already translated from the BegrFordBredd layer. I think one of the ideas of actually using NVDB is to get data that you otherwise couldn't practically get. But I won't fight hard for this. One compromise would be to keep it for the big roads and skip it for residential and below, I think it has most value on the big roads (trunk/primary etc), what do you think?
  23. Maybe just remove this environmental zone, experimental tag anyway. My idea was that it should be resynced using tools, not really manually updated.
  24. My idea was resync hazmat with tool, not manual maintenance. But sure, could look into some solution like with width, using it only on larger roads.
  25. The signed ones are in with maxweightrating, maxaxleload, and similar tags, while maxweight stems from the Bärighetsklass layer which has large and fairly generic coverage, but I think it's quite relevant road information too. I would have preferred if there was a bärighetsklass tag, but maxweight is the translation I found that fits best in OSM.
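
The width compromise in point 22 (keep width on the big roads, skip it for residential and below, and trim the decimals) could look roughly like the sketch below. The function name, the exact set of "big" highway classes and the one-decimal rounding are assumptions for illustration; the thread only says "trunk/primary etc".

```python
# Assumed cutoff for "big roads"; the thread only mentions trunk/primary etc.
MAJOR_HIGHWAYS = {"motorway", "trunk", "primary", "secondary"}


def simplify_width(tags, decimals=1):
    """Keep width only on major roads and round it to a few decimals.

    Hypothetical helper for illustration; maxwidth restrictions are untouched.
    """
    out = dict(tags)
    if "width" not in out:
        return out
    if out.get("highway") not in MAJOR_HIGHWAYS:
        del out["width"]
        return out
    # Format with :g so that a rounded 7.0 is written as "7".
    out["width"] = f"{round(float(out['width']), decimals):g}"
    return out
```

The same on/off rule by highway class could be reused for hazmat and maxweight (points 24 and 25) if the conditional variants are left alone.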

In my first correction run I'll look into the clear cases above and post a comment of what I've done, then I'll look further into the things where we may disagree and further discussion is needed. I have lot on my table now so can be slow to update. Or fast. I'm good at procrastinating and not do what I should really do by working with this instead...

NKAmapper commented 3 years ago

Will walk through tomorrow. I should have mentioned that the observations were made in the Stockholm and Göteborg files (I wanted big samples), if you want to search for any of the tags.

NKAmapper commented 3 years ago

A few comments:

5 I think cases like oneway=yes + lanes=2 + lanes:bus=2 could easily be resolved to lanes=2 + bus=yes + motor_vehicle=no, right? And on a two way street, lanes=2 + lanes:bus:forward=1 could be resolved to lanes=2 + bus:forward=yes + motor_vehicle:forward=no. And similar for the other tags and other number of lanes. I try to do so in my script. Deducing the access tag is important for routing if it can be done. If taxi/bus is a problem, then psv could be used instead (that is 90% of the cases in Norway). I think this type of tagging cannot easily be resolved from the armchair, or it would require considerable time to look up in Mapillary, so it would be valuable information. But nitty-gritty :)
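
The two cases above can be sketched as follows. This is only an illustration of the rule "all lanes in a direction are bus lanes, so add the access restriction", not the code of either script: the function name is hypothetical, tags are assumed to be a dict of strings, and on a two-way street without lanes:forward/lanes:backward the lanes are assumed to be split evenly.

```python
def resolve_bus_access(tags):
    """Turn lanes:bus counts into access tags when every lane is a bus lane.

    Hypothetical helper for illustration only.
    """
    out = dict(tags)
    lanes = int(out.get("lanes", 0))
    if out.get("oneway") == "yes":
        if lanes and int(out.get("lanes:bus", 0)) == lanes:
            del out["lanes:bus"]
            out["bus"] = "yes"
            out["motor_vehicle"] = "no"
        return out
    # Two-way street: assume an even split unless lanes:<direction> says otherwise.
    even_split = lanes // 2 if lanes % 2 == 0 else 0
    for direction in ("forward", "backward"):
        bus_key = f"lanes:bus:{direction}"
        if bus_key not in out:
            continue
        dir_lanes = int(out.get(f"lanes:{direction}", even_split))
        if dir_lanes and int(out[bus_key]) == dir_lanes:
            del out[bus_key]
            out[f"bus:{direction}"] = "yes"
            out[f"motor_vehicle:{direction}"] = "no"
    return out
```

When only some of the lanes are bus lanes the tags are left as they are, since which lane is which cannot be derived from the counts alone. Swapping bus for psv would be a one-line change if that turns out to fit better.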

20 It seems that NVDB has just copied the name of the street to the cycleway, but the convention in OSM is to only have a name for the cycleway if it actually has its own name and sign, which could be the case in for example a park. Most cycleways in OSM do not have a name at all. One (more elaborate) method could be to build a set of street names used by ordinary streets, and then only tag the cycleway with a name if that name is not included in the set. Then most of the cycleways in parks and forests would get their name.
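
A minimal sketch of the set-based method, assuming each way is a plain dict of tags and that cycleway, footway and path are the classes whose names should be checked (both are assumptions, as is the function name):

```python
# Assumed set of highway classes whose names are checked against street names.
CYCLE_FOOT = {"cycleway", "footway", "path"}


def strip_copied_names(ways):
    """Remove names from cycleways/footways when an ordinary street has the same name.

    Hypothetical helper for illustration only.
    """
    street_names = {
        w["name"]
        for w in ways
        if w.get("highway") not in CYCLE_FOOT and "name" in w
    }
    result = []
    for way in ways:
        way = dict(way)
        if way.get("highway") in CYCLE_FOOT and way.get("name") in street_names:
            del way["name"]
        result.append(way)
    return result
```

For example, a cycleway named after the street next to it loses its name, while one named Kattegattleden keeps it as long as no ordinary street carries that name. The set would probably need to be built per municipality, or the same street name in another part of the file could remove a legitimate cycleway name.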

22 The compromise seems ok. 23 Agree. 24 Larger roads would be ok, I think. 25 Perhaps keep it for larger roads?

atorger commented 3 years ago

1: DONE: highway traffic signals should now be correctly tagged
2: voting for no change on maxheight being a node
3: DONE: roundabout street names removed, specific roundabout names kept
4: no action, as far as I know turn lane information does not exist in NVDB
5: DONE: access restrictions now added when all lanes are bus lanes
6: DONE: no redundant lanes
7: DONE: no redundant lanes:forward on oneway
8: no action? Not sure exactly what you found though. I looked at Stockholm with :backward conditionals on oneway=yes, and the NVDB source data states the backward direction in the FörbudTrafik layer in these cases. Probably wrong in the NVDB data, but these only occurred in three places in Stockholm so I don't think there is a need for automatic correction. Auto-correction of source data errors is risky so should only be done when really needed.
9: DONE: hazmat/width/maxweight/environmental zone dropped due to other bullets. Regarding bicycles on "gågata", the traffic law default is max 5 km/h (some sources state 7), so it's allowed, but not really something you would want to route through unless you must. It's very common with "please walk with your bike" signs, i.e. riding is not forbidden, but preferably avoided. But as the maxspeed is set to 5 km/h I guess one could set bicycle=yes, which I have now unless there's already a vehicle or bicycle tag due to some other reason.
10: DONE: more tags for ferry routes now
11: DONE: separate issue to turn cycleway crossings to nodes (high priority) https://github.com/atorger/nvdb2osm/issues/8
12: DONE: see later comments.
13: DONE(?): the oneway cycleways are clearly marked as such in the source data, typically left and right side of large roads in big cities, so no action on those. The quirk with rare occurrence of oneway on path is due to "Annan cykelbar förbindelse" being translated to highway=path + bicycle=yes, which I think is a good translation, but if there's oneway put on top the script now upgrades it to cycleway. I've seen this NVDB tag used for longer paths not really maintained by the municipality as cycleways but they are useful links for cyclists. In the Göteborg dataset it is also used on short 10 meter non-cycleway segments just to tie together cycleways. Maybe "Annan cykelbar förbindelse" should be made cycleway altogether, but in a way path seems more correct since it's not really a cycleway like the ones marked as cycleway. On the other hand it's not a "path" either, it's just a routing technicality over a piece of pavement joining disconnected cycleways.
14: DONE: indeed too many tags for railway platforms, now reduced to just railway=platform
15: DONE: new tag interpretation, and track upgrade algorithm on top
16: DONE: there was indeed a bug causing tertiary not to appear in cities.
17: DONE: Highway *_link filed as separate enhancement issue https://github.com/atorger/nvdb2osm/issues/20
18: Bridge/tunnel false connections filed as separate (low priority) issue.
19: Bridge/tunnels should be resolved already where possible, need to point at specific bug to solve so I can see source data etc that causes the issue
20: DONE: redundant cycleway names are now removed https://github.com/atorger/nvdb2osm/issues/22
21: No action? Using tunnel=building_passage now, and I think that is more likely to be the correct tag rather than covered=yes: https://wiki.openstreetmap.org/wiki/Tag:tunnel%3Dbuilding_passage (both may not be used at the same time). A brief look at Stockholm I'd say out of 15 passages, ~12-13 are passages through larger buildings, while the remaining two are covers over the footway. One thing DONE though: when it's bridge=yes then it converts to covered=yes
22: DONE: width removed for all but the larger roads
23: DONE: environmental zone parsing removed
24: DONE: hazmat only kept on major roads unless conditional (but I think conditional is only on major roads anyway)
25: DONE: maxweight from bärighetsklass only kept on major roads unless conditional (same here conditional only on major roads)

All other numbers I haven't had time to look at yet. It's probably wise not to comment so much in this thread before I've worked more on the list. I'll edit this message as I've gone through more things.
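
As an illustration of the rule described under point 9 (bicycle=yes on a pedestrian street unless a vehicle or bicycle tag is already there), here is a minimal sketch. The function name is hypothetical, and treating conditional variants such as bicycle:conditional as "already tagged" is an assumption.

```python
def add_bicycle_on_pedestrian(tags):
    """Add bicycle=yes to a pedestrian street unless bicycle/vehicle access is already tagged.

    Hypothetical helper for illustration only.
    """
    out = dict(tags)
    if out.get("highway") != "pedestrian":
        return out
    already_tagged = any(
        key in ("bicycle", "vehicle") or key.startswith(("bicycle:", "vehicle:"))
        for key in out
    )
    if not already_tagged:
        out["bicycle"] = "yes"
    return out
```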

NKAmapper commented 3 years ago

Very efficient :)

9: Come to think of when you mention 5 km/h - is that maxspeed just a "dummy" value perhaps, and maxspeed could be omitted in those cases? The description in the table is "walking speed".

atorger commented 3 years ago

I'm not sure exactly how the law is written, the 5 km/h maxspeed is in the NVDB data and you aren't allowed by law to exceed "walking speed", but I do think it's a "dummy" value yes. There is this proposal: https://wiki.openstreetmap.org/wiki/Proposed_features/maxspeed_walk

But I thought that maxspeed=5 even if dummy is a more widely understood tag. I think we should have at least something, removing it completely I don't think is good, as there is a law for it, even though it may be written as "walking pace" rather than a defined speed.

Surfing a bit on the subject it seems like the law states "walking pace", the police enforces a max limit of 7 km/h, but when stated as a speed instead of "walking pace" the number 5 is more commonly used than upsetting-the-police-limit of 7. https://www.dt.se/artikel/7-km-h-pa-gangvag

atorger commented 3 years ago

I'm changing the handling of GCM 9: marking it cycleway if it's paved or oneway, otherwise a path with bicycle=yes. I think this will be a quite good default. As far as I know there is no tag in OSM to make non-cycleway route connections between disconnected cycleways, which is what GCM 9 is used for inside big cities; these are in OSM just tagged cycleway anyway even if it's just a broad section of general sidewalk pavement. If there is special tagging in OSM for this, we should use that instead... but for now I make them cycleway.

            # This type unfortunately has multi-uses. In larger cities it's commonly used to
            # connect disconnected cycleways, eg in places you need to pass 10 - 20 meters of
            # pavement to get on to the next section. But it's also used for longer sections
            # of unpaved tracks that make practical links for cyclists but are not really
            # maintained as cycleways.
            #
            # To differ between these we look at road surface, and if it's marked oneway
            # (happens in some cases in cities) we also upgrade it to cycleway
            #
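
The decision described in the comment above can be sketched like this. It is not the actual nvdb2osm code: the function name, its parameters and the set of surface values counted as paved are assumptions for illustration.

```python
# Assumed surface values that count as "paved" for this sketch.
PAVED = {"paved", "asphalt", "concrete", "paving_stones"}


def translate_gcm9(surface=None, oneway=False):
    """Translate a GCM type 9 segment to OSM tags.

    Hypothetical helper: cycleway if paved or oneway, otherwise path + bicycle=yes.
    """
    if oneway or surface in PAVED:
        tags = {"highway": "cycleway"}
    else:
        tags = {"highway": "path", "bicycle": "yes"}
    if oneway:
        tags["oneway"] = "yes"
    return tags
```

A segment with unknown surface and no oneway ends up as path + bicycle=yes here, which matches the "otherwise" branch but may be too pessimistic for the short connecting segments in cities.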
atorger commented 3 years ago

About differences in our barrier tables:

For "låst grind eller bom", I use the more unspecified "gate" instead of "lift_gate", as in reality there are different designs, they can be lift gates, swing gates and even sliding gates. The most common design is indeed lift gate, at least here in the north where I've seen many of those, but I don't think we need to specify type. "Låst grind eller bom" is almost exclusively the type used in forestry and private roads, so as usual it's a "lower quality" data set in terms of exact specification of type, so I vote for keeping "gate" on this.

"Eftergivlig grind", I have changed my script to use your translation "swing_gate". There are only 24 of these up north, all reported by municipalities, almost all on cycleways, and all the ones I've seen in person are indeed swing gates. I think they are designed so service vehicles can pass them.

"Betonghinder" I have translated to block and you jersey_barrier, I think both are ok. These are in reality usually a single "betongsugga", often in art shape (in Stockholm they are shaped as lions, here in Luleå they are wolverines(!)). Although single element link-jersey_barriers are indeed also common, I think jersey_barrier in OSM is more suitable when they are used in its linked form, to make barriers between lanes in roads. So I vote for keeping the "block" translation. Making an overpass search it seems like block is used more commonly today for node barriers.

On the others we have same translation.
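
Summarised as a lookup table, the three translations discussed above would be something like the sketch below. The dictionary keys are the type names as written in this thread; the actual field values in the NVDB layer, the function name and the normalisation are assumptions.

```python
# Barrier translations as settled in this thread; keys are the names used here,
# not necessarily the exact values stored in NVDB.
BARRIER_TRANSLATION = {
    "låst grind eller bom": "gate",
    "eftergivlig grind": "swing_gate",
    "betonghinder": "block",
}


def translate_barrier(nvdb_type):
    """Return OSM tags for an NVDB barrier type, or an empty dict if unknown.

    Hypothetical helper for illustration only.
    """
    value = BARRIER_TRANSLATION.get(nvdb_type.strip().lower())
    return {"barrier": value} if value else {}
```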

NKAmapper commented 3 years ago

Barriers: I have been thinking of barrier=gate to be used specifically for "grind" rather than gate being a general barrier tag, perhaps because of the picture in the wiki. Not sure. Anyway, this is why I have been using lift_gate as the default, since that tends to be right in 90% of cases. Not a big thing though.

On cycleways I mostly see swing_gate and a few cycle_barrier. Block is fine.

atorger commented 3 years ago

About differences in our traffic_calming tables:

Traffic calming features are generally reported by municipalities, so the quality is higher than private road data, but lower than Trafikverket data, so I have seen a fair bit of errors/inconsistent use.

gupp: I use bump, you hump. In reality it can be both. I chose to use "bump" as "speed bump" is the more generic term (outside the world of OSM wiki) of which "speed hump" is a sub-type. I vote to keep "bump".

"Chicane" is not mapped as a single node in NVDB, it seems to be commonly mapped as two nearby single-sided choker nodes (TYP=1). An algorithm could be made to merge these, but then more research would be needed to see how they are used. As I don't trust the data too much from initial research I don't think it's worthwhile.

Refug: changed to your translation, island. I looked at some actual uses and it seems to be correct.

Avsmalning (typ 3): I translate these to "choker", just like typ 1. The difference between the types in NVDB is that typ 1, avsmalning till ett körfält, means that it's so narrow only one car can pass, while this choker makes the road narrower, but still wide enough that two cars can pass (if you are brave). In OSM typ 1 should ideally be choker with an additional priority tag, but as there's no information on which side the choker is we can't resolve priority, so typ 1 and typ 3 end up being the same.

For the others we have the same translation. I have not found any practical use of the Läge tag (enkelsidig, dubbelsidig, genomgående), as for chokers we need to know the side if it's enkelsidig. It would be used if we want to merge two choker nodes into one chicane node.

NKAmapper commented 3 years ago

Not very important, but I think the 30 cm bump is very rare. I was not able to see any of those on the aerials. I believe hump is not a subtype of bump, neither in the OSM wiki nor in Wikipedia. There are equally many bump and hump tagged in Sweden, but I suppose bump might be a tagging mistake in many cases.

atorger commented 3 years ago

About differences in GCM:

Some of the types are marked in NVDB as G+C, i.e. both cycleway and footway. I don't translate these statically but have a resolving algorithm for crossing sections to resolve which should be footway and which should be cycleway, like this:

Rules for marking road/street crossings:

    if cycleway on both ends: make cycleway
    else if footway on both ends: make footway
    else if footway on one end and cycleway on the other: make footway
    else: make path

It seems to work well.
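
Written out as a function, the rules read as below. This is only a sketch of the rules as stated in this comment, not the actual resolver in nvdb2osm; the function name and the representation of each end as a highway value string (or None when nothing is connected) are assumptions.

```python
def resolve_crossing(end_a, end_b):
    """Pick the highway value for a G+C crossing section from its two neighbours.

    Hypothetical helper: end_a and end_b are the highway values of the ways
    connected at each end, or None if there is no neighbour.
    """
    ends = {end_a, end_b}
    if ends == {"cycleway"}:
        return "cycleway"
    if ends == {"footway"}:
        return "footway"
    if ends == {"footway", "cycleway"}:
        return "footway"
    return "path"
```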

I don't use "pedestrian" as pedestrian streets are in the road network. Looking at how the data is used, footway/resolving seems more appropriate (typ 24, set statically to footway and 26, resolved depending on neighbors).

Way segments marked "crossing" are nowadays simplified to node crossings in a later step, so there are no crossing segments left.

About the use of "segregated", on TYP=3 it seems incorrect as if there is a footway it's mapped separately. There's TYP=28 for segregated cycleways and footways on the same way, there the use of "segregated" is correct. However, it's only used for crossings and makes little sense now when we simplify these to nodes, so I've not added it in (added a comment in the code though). If one would go back to not using node crossings we should use segregated.

atorger commented 3 years ago

I'm no English expert, it was this wiki article that made me think that speed hump is a sub-type of speed bump: https://en.wikipedia.org/wiki/Speed_bump What do you think? Should I change it or not? Where I live there are indeed some narrow speed bumps, but I agree, the hump type is certainly more common.

Edit: I changed it to hump, actually reading the whole wikipedia article and not just the first paragraphs made it more clear that bump and hump are separate types, and then I agree hump is a better guess.

atorger commented 3 years ago

On gate vs lift_gate, no big deal for me either. Looked at overpass and both are used extensively, but "gate" is used about 2 - 3 times more than "lift_gate", so I vote to continue with "gate", with the intention that it's a general gate.

(I think one of OSM's design problems is that they rarely specifically define a hierarchy from less specified to more specified tags, it's a rather mixed bag. In any case, the purpose of using "gate" here is that I interpret it as a more generic term, like "block" more generic than "jersey_barrier")

atorger commented 3 years ago

Going through the original list of 25 points I think I've gone through them all now. A couple are registered as open enhancement issues, but I've implemented the high priority ones.

NKAmapper commented 3 years ago

Very good :) I can review when the new files are available.

atorger commented 3 years ago

Closing this issue. Most/all should be in sync with this issue. Open a new issue if there are further things to improve.