mapbox / osm-compare

Functions that identify what changed during a feature edit on OpenStreetMap.
ISC License

Features that are rarely created in the real world #131

Open bkowshik opened 7 years ago

bkowshik commented 7 years ago

Ref: https://github.com/mapbox/osm-compare/issues/112


There are features that are not created often in the real world. Ex: It is not every day that a new airport is constructed. So, if these features are rare in the real world, shouldn't they also be rare on OpenStreetMap? Should we flag these rare features for manual review?

[Screenshot: a harmful aeroway=aerodrome created by a new user in Georgetown]

What other features are rarely created in the real world? 💭


cc: @planemad @manoharuss @geohacker @amishas157

bkowshik commented 7 years ago

[Screenshot: a railway track from nowhere]

cc: @srividyacb

amishas157 commented 7 years ago

The idea is wonderful @bkowshik 👌. Yes, features that are rare in the real world should be rare on OSM too, and not created every now and then. And if these incidents happen, we should definitely 👀 them. Let's take this ahead. I have a few thoughts on how we can go about this.

What other features are rarely created in the real world? 💭

Following is the list of rare and critical features present in OSM that I could think of.

Feel free to add anything that would make sense for the compare function.

Also, I have a couple of questions that I am looking to answer alongside building the compare function. This is purely to understand the history of these features, not to act as a blocker / prerequisite step for this compare function.

Next actions:

amishas157 commented 7 years ago

WIP: https://github.com/mapbox/osm-compare/pull/157

willemarcel commented 7 years ago

It's common for Mapsme users to map their home as a castle, for example: https://osmcha.mapbox.com/48163958/ I think we could add castles to this list of uncommon features.

poornibadrinath commented 7 years ago

I took a look at a few changesets today, and the pattern I noticed was that the changesets currently flagged are all for mountain tags. A few things that I had in mind when I was going through these changesets were:

Point 1: Are we gonna flag every detail being added? A few changesets are having name tags being added. Would it be valuable to flag them?

Point 2: Is flagging mountain ranges important? Many times, they are mapping the ones that are not mapped. The ones I have noticed so far have been like that. Can we tweak it a bit to generate less noise?

Point 3: If the mountain keys are removed, it is very rare that you get any changesets regarding rare and critical features. Are there any other features that are really critical and rarely added that should be included in this list?

As Wille suggested, random castles being added to the map have a higher priority. Also, are viewpoints and monuments important? Right now they are being tagged by the maps.me filter. But would it make sense to ramp it up a bit? Are they critical and rarely added?

Question: Changes to admin boundaries and oceans occur regularly. When someone adds a new node that touches an admin boundary or when someone modifies the coastline, it gets flagged. What are the specific tags that are a priority for admin boundaries and oceans?

Also, one feature to look out for is beaches. There are lots of cases where the beach tag is added randomly. Will that tag qualify as something rare?

I am going to continue looking out for the changesets of this comparator, just so we will get a solid idea of how we want to improve this. Will add notes on the findings as and when.

/cc: @manoharuss @amishas157 @bkowshik

bsrinivasa commented 7 years ago

Changes to admin boundaries and oceans occur regularly. When someone adds a new node that touches an admin boundary or when someone modifies the coastline, it gets flagged. What are the specific tags that are a priority for admin boundaries and oceans?

@poornibadrinath We don't update either of them (admin boundaries or coastlines) in Mapbox Streets regularly. So we would guess this is not a priority and we should remove them from our detections, i.e. stop flagging them as suspicious. Is this correct, @manoharuss?

poornibadrinath commented 7 years ago

@bsrinivasa agreed. Since they are not updated regularly on Mapbox Streets and since changes to them do occur frequently on OSM, flagging them as rare and critical features would make no sense. Until there is some value in finding changesets regarding boundaries and oceans, we can remove them from the suspicious list. /cc: @manoharuss

krishnanammala commented 7 years ago

We don't update either of them (admin boundaries or coastlines) in Mapbox Streets regularly. So we would guess this is not a priority and we should remove them from our detections, i.e. stop flagging them as suspicious. Is this correct, @manoharuss?

Yes, I agree. We are not worried about admin boundaries and coastlines, as they don't show up on Streets. That's the reason we have removed them from the straight detector too.

A few changesets are having name tags being added. Would it be valuable to flag them?

This is also similar to the case above ^^. Refer to ticket https://github.com/mapbox/osm-quarantine/issues/256

cc @poornibadrinath @manoharuss

amishas157 commented 7 years ago

Thanks a lot @poornibadrinath for digging into this. A couple of queries and answers to your questions:

Are we gonna flag every detail being added? A few changesets are having name tags being added. Would it be valuable to flag them?

This compare function is meant to catch newly created features, not deletions or modifications of old ones. Therefore, whichever features satisfy these filters and are newly created will be caught.

Is flagging mountain ranges important? Many times, they are mapping the ones that are not mapped. The ones I have noticed so far have been like that. Can we tweak it a bit to generate less noise?

Mountains became part of this compare function for two main reasons:

If the mountain keys are removed, it is very rare that you get any changesets regarding rare and critical features. Are there any other features that are really critical and rarely added that should be included in this list?

Yes, we need to figure out those features. Happy to extend this list if anything comes up that looks like it satisfies the objective of this compare function.

As Wille suggested, random castles being added to the map have a higher priority. Also, are viewpoints and monuments important? Right now they are being tagged by the maps.me filter. But would it make sense to ramp it up a bit? Are they critical and rarely added?

Yes, I agree that random castles are being added to the map. A castle is a rare kind of feature, but not a critical one, so it looks like it does not fall within the scope of this compare function. But we could definitely take a stab at it in a separate compare function?

Question: Changes to admin boundaries and oceans occur regularly. When someone adds a new node that touches an admin boundary or when someone modifies the coastline, it gets flagged. What are the specific tags that are a priority for admin boundaries and oceans?

I would like to clarify here that this compare function won't flag an admin boundary or ocean that is modified or deleted; it will only flag when either of them is created (version = 1).
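The creation-only rule described above can be sketched roughly as follows. This is a minimal illustration and not the code in osm-compare: the function name `isRareFeatureCreation`, the `RARE_TAGS` list, and the assumption that a feature is a GeoJSON-like object carrying its OSM version in `properties['osm:version']` are all hypothetical.

```javascript
// Illustrative tag list only (an assumption, not the list used by osm-compare).
const RARE_TAGS = [
  { key: 'aeroway', value: 'aerodrome' },
  { key: 'boundary', value: 'administrative' },
];

// Returns true only for a newly created feature (version = 1) that carries
// one of the tags above. Modified or deleted features are never flagged.
// The 'osm:version' property name is an assumption made for this sketch.
function isRareFeatureCreation(feature) {
  if (!feature || !feature.properties) return false;
  const props = feature.properties;

  // Anything other than version 1 is a modification, so skip it.
  if (Number(props['osm:version']) !== 1) return false;

  return RARE_TAGS.some((tag) => props[tag.key] === tag.value);
}

// A brand-new aerodrome is flagged; the same feature at version 2 is not.
console.log(isRareFeatureCreation({
  properties: { 'osm:version': 1, aeroway: 'aerodrome' },
})); // true
console.log(isRareFeatureCreation({
  properties: { 'osm:version': 2, aeroway: 'aerodrome' },
})); // false
```

Keeping the version check ahead of the tag check means a new node added to an existing boundary, or a name tag added to an existing feature, falls through without being flagged.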

Also, one feature to look out for is beaches. There are lots of cases where the beach tag is added randomly. Will that tag qualify as something rare?

Yes, that tag would qualify as both rare and critical. But the problem that could arise is the same as with mountains: there is no clear distinction between important and not-so-important beaches. There are two options we can consider:

I am going to continue looking out for the changesets of this comparator, just so we will get a solid idea of how we want to improve this. Will add notes on the findings as and when.

🙇 Let me know if something is unclear in the above. ✌️

cc @bkowshik @manoharuss

poornibadrinath commented 7 years ago

This is super detailed! @amishas157, thanks for the explanation. Next actions here would be:

This looks like a good place to start

Will be happy to help you on this :)