osmlab / fixing-polygons-in-osm

Fixing (multi)polygons in OpenStreetMap

"uninteresting"/"unimportant" tag list for detecting old-style polygons #18

Closed ImreSamu closed 7 years ago

ImreSamu commented 7 years ago

I didn't find any information, so maybe somebody can help me.

What is the full "uninteresting"/"unimportant" tag list for detecting old-style polygons?

My first guess: the keys marked with 'delete' in the osm2pgsql default style:

note
note:*
source
source_ref
source:*
attribution
comment
fixme
created_by
odbl
odbl:note
SK53_bulk:load
tiger:*
NHD:*
nhd:*
gnis:*
geobase:*
accuracy:meters
sub_sea:type
waterway:type
KSJ2:*
yh:*
osak:*
kms:*
ngbe:*
naptan:*
CLC:*
3dshapes:ggmodelk
AND_nosr_r
import
it:fvg:*

Do we have a better definition?
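To make the question concrete, here is a minimal Python sketch of how such a list could be applied. It assumes the entries ending in `:*` are meant as key prefixes and matches them as shell-style wildcards; the names `UNINTERESTING_PATTERNS`, `is_uninteresting` and `interesting_tags` are made up for illustration and do not come from osm2pgsql.

```python
from fnmatch import fnmatchcase

# Keys copied from the list above. Entries ending in ":*" are treated
# as wildcards (an assumption about how the style file means them).
UNINTERESTING_PATTERNS = [
    "note", "note:*", "source", "source_ref", "source:*", "attribution",
    "comment", "fixme", "created_by", "odbl", "odbl:note", "SK53_bulk:load",
    "tiger:*", "NHD:*", "nhd:*", "gnis:*", "geobase:*", "accuracy:meters",
    "sub_sea:type", "waterway:type", "KSJ2:*", "yh:*", "osak:*", "kms:*",
    "ngbe:*", "naptan:*", "CLC:*", "3dshapes:ggmodelk", "AND_nosr_r",
    "import", "it:fvg:*",
]


def is_uninteresting(key):
    """Return True if the key matches any pattern (case-sensitive)."""
    return any(fnmatchcase(key, pattern) for pattern in UNINTERESTING_PATTERNS)


def interesting_tags(tags):
    """Return a copy of the tag dict without the uninteresting keys."""
    return {k: v for k, v in tags.items() if not is_uninteresting(k)}
```

Matching is case-sensitive on purpose, since the list carries both `NHD:*` and `nhd:*` as separate entries.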

grischard commented 7 years ago

JOSM also has converted_by and import in the keys it considers discardable.

I'd add history and FIXME.

tyrasd commented 7 years ago

Maybe also add name and name:* (and similar tags like ref, etc.)?

A completely different approach would be to flag every multipolygon that has no area tag set. One could use an inverse version of osm-polygon-features or id-area-keys for that (maybe with added variants of the respective tags, e.g. disused:<area-key>, abandoned:<area-key>, etc.). This approach might produce a few false positives, but it wouldn't miss cases that the tag-list approach could overlook (false negatives).
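A rough Python sketch of that inverse check, under loud assumptions: `AREA_KEYS` is a tiny hand-picked placeholder, not the real osm-polygon-features or id-area-keys data (which also carry per-value rules that are ignored here), and the function names are invented for this example.

```python
# Placeholder subset of area keys, for illustration only. A real check
# would load the full list from osm-polygon-features or id-area-keys.
AREA_KEYS = {"building", "landuse", "natural", "leisure", "amenity"}

# Lifecycle variants mentioned above (disused:<area-key>, abandoned:<area-key>).
LIFECYCLE_PREFIXES = ("disused:", "abandoned:")


def has_area_tag(tags):
    """Return True if any key, with a lifecycle prefix stripped, is an area key."""
    for key in tags:
        base = key
        for prefix in LIFECYCLE_PREFIXES:
            if key.startswith(prefix):
                base = key[len(prefix):]
                break
        if base in AREA_KEYS:
            return True
    return False


def flag_multipolygon(relation_tags):
    """Flag a multipolygon relation that carries no area tag at all."""
    return relation_tags.get("type") == "multipolygon" and not has_area_tag(relation_tags)
```

With this sketch, a relation tagged only with `type=multipolygon` and `name=*` is flagged, while one tagged `disused:landuse=quarry` is not.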

grischard commented 7 years ago

@tyrasd it would be very interesting to compare the two. The relevant and slightly complex piece of code, in JOSM, is hasAreaTags.

It's interesting how these three sources have slightly different ideas of what's an area and what isn't.

ImreSamu commented 7 years ago

@tyrasd

Maybe also add name and name:* (and similar tags like ref, etc.)?

In my mind, we have four categories of OSM keys.

I would like to separate the tagging errors from the old-style polygons.

So some of them are maybe tagging problems (no primary OSM keys).

And some are maybe old-style multipolygon candidates (only "uninteresting"/"unimportant" OSM keys).

A completely different approach would be to flag every multipolygon that has no area tag set. One could use an inverse version of osm-polygon-features or id-area-keys ...

My approach is similar.

joto commented 7 years ago

I don't think we have to make it all so complicated. Just fix everything that has no tag except type=multipolygon and we are a good way there. Ignore source and created_by and we get almost 100%. After that we can look at other tags, and the obvious place to look is osm2pgsql. Whatever osm2pgsql thinks is old-style will have to be dealt with eventually. If you start looking more closely into the tag combinations, you'll get to many things that aren't really about fixing old-style multipolygons but about general "fixing of suspect" data, and that's not our focus here.
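The simple rule described here could be sketched in Python roughly as follows. This is only an illustration, not code from this repository or from osm2pgsql; `IGNORED_KEYS` and `is_untagged_multipolygon` are hypothetical names.

```python
# The two keys named above as safe to ignore.
IGNORED_KEYS = {"source", "created_by"}


def is_untagged_multipolygon(relation_tags):
    """True if the relation has type=multipolygon and nothing else that counts."""
    if relation_tags.get("type") != "multipolygon":
        return False
    remaining = set(relation_tags) - IGNORED_KEYS - {"type"}
    return not remaining
```

Any further keys to ignore, such as those osm2pgsql drops, would simply be added to `IGNORED_KEYS`.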

ImreSamu commented 7 years ago

Side note: as far as I can see, the new Lua-based openstreetmap-carto version will have more "unwanted" keys than the current version. See delete_tags and delete_prefixes in openstreetmap-carto.lua