mapbox / osm-compare

Functions that identify what changed during a feature edit on OpenStreetMap.
ISC License
38 stars 15 forks source link

Intelligent interface between consumers and comparators #140

Open bkowshik opened 7 years ago

bkowshik commented 7 years ago

@amishas157 and me braistormed on an interface between the comparators. With this setup we hope to:

Scenario

screen shot 2017-04-05 at 10 36 26 am

Current setup

Proposed setup

intelligent interface - osm-compare - validation


Would love to hear all your thoughts. cc: @geohacker @batpad

batpad commented 7 years ago

@bkowshik in this scenario, why can we not just have a utility function, that is something like isOnlyWikidataChange -- and then the osm-landmarks function checks, and if it is onlyWikiDataChange, stops further processing?

I'm a bit wary to build up these complex interwebs of relationships between compare function where each one needs to keep a track of its dependents and dependencies and I want to see if we can continue keeping compare functions single and atomic. I would imagine having a lot of small utility functions that are responsible for making these kinds of assertions about a change, and then the compare function be responsible for invoking them and stopping further processing if the change type matches certain conditions.

I think the other option would be something like:

function my CompareFunc(newVersion, oldVersion) {

)

myCompareFunc.exclude = [
  'wikidataEdit',
  'deletedFeature
];

and then the code that is responsible for running comparators checks the tags added to the feature against these properties defined by the function and decides whether to run it or not.

Not sure if there's another kind of architecture you have in mind.

I still prefer the approach of going with small utility functions and making it the responsibility of the comparator code to check whether this is something they are interested in or not by calling the utility functions.

But I can see the value in consolidating to an architecture where edits are "tagged" once as they come in, and there's some high level code that uses those "tags" to decide whether to call certain comparators or not. Happy to discuss more!

bkowshik commented 7 years ago

why can we not just have a utility function, that is something like isOnlyWikidataChange -- and then the osm-landmarks function checks, and if it is onlyWikiDataChange, stops further processing?

Let's take a scenario where the Wikidata tags of Eiffel Tower and Osteria Francescana are wrongly modified and it was just a Wikidata data change only. We want the osm-landmarks comparator to do the additional checks that it can do to to add more context and priority for the edits. In this case, Eiffel Tower being one of the wonders of the world could potentially be a more important feature to :eyes: before the restaurant. We won't get this additional context if osm-landmarks stops further processing. But, when the Wikidata changese are actually right, we don't want any additional context from osm-landmarks since this was just a Wikidata change.

batpad commented 7 years ago

Ok, so here's my proposal:

I think my initial thinking here was that the compare functions themselves would be responsible only for making these small inferences, like isNewMapper or isLandmark, etc. - and then it would be the responsibility of a module like pushToOsmcha to decide what should be regarded as "suspicious" and pushed as a suspicion (i.e. it could conceivably combine outputs of different detectors, etc.)

I'm definitely seeing a need to resolve the current problem we're facing - we essentially want to separate the task of "classifying" an edit, so-to-speak, based on the type of edit, or type of feature that was edited, or whatever -- and the task of "casting suspicions" about that edit, which very often is by using a combination of these different attributes that have been classified.

^ does all this make sense @bkowshik @ajithranka @amishas157 ? Let's voice a bit to make sure we're on the same page.

batpad commented 7 years ago

But, when the Wikidata changese are actually right, we don't want any additional context from osm-landmarks since this was just a Wikidata change.

@bkowshik could you expand more on "but, when the wikidata changes are actually right?" - so we'd need one function that checks if the edit was only to a wikidata tag -- and then another function that does things like query the wikidata APIs to try and figure out whether the change seemed correct?

bkowshik commented 7 years ago

Per voice with @batpad @amishas157 @ajithranka


There are use-cases when we want to just label features so that consumers can slice and dice through the labels to find interesting stuff. Ex: On osmcha, users can filter large lakes touched by new users. Then, there is the other case when we want to label features as either harmful or not harmful. So, how about we breakup up compare functions into two groups:

Taggers label features, Ex:

Flaggers use the taggers to label features as harmful or not harmful. Ex:


@geohacker

batpad commented 7 years ago

To expand a bit:

I hope this eases a lot of confusion around compare functions and whether they need to be "broad" or "narrowly scoped" and what aspects we should work on to reduce noise, etc. Taggers should be responsible to be pretty dumb classifiers - a lot of them will be simple booleans, but could also return values. Flaggers would simply be a bunch of if statements combining outputs of various flaggers to decide whether to treat an edit as suspicious or not.