mapbox / gabbar

Guarding OpenStreetMap from harmful edits using machine learning
MIT License

Using results from osm-compare as features #18

Closed bkowshik closed 7 years ago

bkowshik commented 7 years ago

In osm-compare, we have a mechanism to add interesting labels to changesets on osmcha. For example, the null_island comparator labels a changeset as "Feature near Null Island" when one of its features is very close to [0, 0].

In our last sync, @anandthakker and I discussed the idea of using the results from these compare functions as features to train the machine learning model. I personally think this will be :boom:! With this PR, we explore this idea.
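To make the idea concrete, here is a minimal sketch of how comparator results could be turned into a binary feature vector. It is only an illustration: apart from `null_island`, the comparator names are hypothetical placeholders, and the result format (a dict per comparator, empty or missing when nothing was flagged) is an assumption rather than the actual osm-compare output.

```python
# Hypothetical comparator list: only null_island is named in this thread,
# the other two entries are placeholders for illustration.
COMPARATORS = ["null_island", "example_comparator_a", "example_comparator_b"]


def to_feature_vector(compare_results):
    """Map one changeset's comparator results to a list of 0/1 features.

    compare_results: dict of comparator name -> result (assumed format).
    A missing or falsy result means the comparator did not flag the changeset.
    """
    return [1 if compare_results.get(name) else 0 for name in COMPARATORS]


# A changeset flagged only by the null_island comparator.
flagged = {"null_island": {"message": "Feature near Null Island"}}

print(to_feature_vector(flagged))  # [1, 0, 0]
print(to_feature_vector({}))       # [0, 0, 0]
```

Each changeset then contributes one row of 0/1 values, one column per comparator, which a classifier can consume directly.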


                 precision    recall  f1-score   support

problematic           0.75      0.00      0.01      1731
not problematic       0.91      1.00      0.95     17488

avg / total           0.90      0.91      0.87     19219
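The "avg / total" row can look reassuring even though the problematic class is barely detected. A short check, assuming the averages are weighted by support (as scikit-learn's classification report did for this row), reproduces the numbers from the two class rows and shows how the much larger "not problematic" class dominates them:

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
rows = {
    "problematic": (0.75, 0.00, 0.01, 1731),
    "not problematic": (0.91, 1.00, 0.95, 17488),
}

total_support = sum(r[3] for r in rows.values())


def weighted_average(column):
    """Support-weighted mean of one metric column (0=precision, 1=recall, 2=f1)."""
    return sum(r[column] * r[3] for r in rows.values()) / total_support


averages = [round(weighted_average(i), 2) for i in range(3)]
print(averages, total_support)  # [0.9, 0.91, 0.87] 19219
```

The per-class recall for problematic changesets is therefore the number to watch, not the averaged row.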


anandthakker commented 7 years ago

@bkowshik this is an exciting prospect!

Looks like the first try is failing to produce many true positives (i.e. very low recall score for problematic). I wonder if balancing/weighting the input data might help here... Maybe check out the class_weight parameter to SVC?
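As a rough illustration of what balancing would do here, the sketch below computes per-class weights with the formula n_samples / (n_classes * class_count), which is what scikit-learn uses for `class_weight='balanced'`. The label counts are taken from the support column of the report above; the helper name `balanced_class_weights` is made up for this example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Label counts from the support column of the report.
labels = ["problematic"] * 1731 + ["not problematic"] * 17488

weights = balanced_class_weights(labels)
for cls, weight in weights.items():
    print(f"{cls}: {weight:.2f}")
# problematic: 5.55
# not problematic: 0.55
```

A dictionary like this, or simply the string `'balanced'`, can be passed as the `class_weight` argument to `SVC`, so that misclassifying a problematic changeset costs roughly ten times as much as misclassifying a non-problematic one.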

bkowshik commented 7 years ago

For reference, see how the different classifiers and classifier parameters are laid out in wiki-ai/editquality.

bkowshik commented 7 years ago

> I wonder if balancing/weighting the input data might help here... Maybe check out the class_weight parameter to SVC?

Found an interesting gist from a Wikimedia team member on testing multiple parameters.
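The gist itself is not reproduced here, so the following is only a generic sketch of testing multiple parameters: enumerate every combination in a grid and evaluate each one. The parameter names (`C`, `kernel`, `class_weight`) are real `SVC` options, but the values are illustrative assumptions, and `iter_param_combinations` is a hypothetical helper.

```python
from itertools import product

# Hypothetical grid: parameter names are SVC options, values are illustrative.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "class_weight": [None, "balanced"],
}


def iter_param_combinations(grid):
    """Yield one dict per combination of parameter values in the grid."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combos = list(iter_param_combinations(param_grid))
print(len(combos))  # 12
print(combos[0])    # {'C': 0.1, 'class_weight': None, 'kernel': 'linear'}
```

Each combination would then be used to train and score a model, keeping an eye on recall for the problematic class. scikit-learn's `GridSearchCV` performs this enumeration together with cross-validation.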

bkowshik commented 7 years ago

We have moved over to a common notebook at the link below: