mapbox / gabbar

Guarding OpenStreetMap from harmful edits using machine learning
MIT License

Feature engineering #39

Closed bkowshik closed 7 years ago

bkowshik commented 7 years ago

Working on https://github.com/mapbox/gabbar/issues/37

bkowshik commented 7 years ago

Training dataset

The area under the ROC curve (AUC) measures how well a model discriminates between the positive and negative classes. An AUC of 1.0 means the model ranks every positive example above every negative one. An AUC of 0.5 means the model does no better than random guessing.
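To make the two reference points concrete, here is a minimal sketch that computes AUC as the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counting as half. The function name `auc_from_scores` and the toy labels and scores are hypothetical and are not taken from gabbar's code or data.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: True marks the positive (harmful) class.
labels = [True, True, False, False]
perfect_scores = [0.9, 0.8, 0.2, 0.1]    # every positive scored above every negative
constant_scores = [0.5, 0.5, 0.5, 0.5]   # scores carry no information

auc_perfect = auc_from_scores(labels, perfect_scores)
auc_constant = auc_from_scores(labels, constant_scores)
```

With these inputs the perfectly separated scores give an AUC of 1.0 and the constant scores give 0.5, matching the two reference points described above. The pairwise loop is quadratic, so it is meant for illustration rather than for large datasets.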

Validation dataset

| | Predicted good | Predicted harmful |
| --- | --- | --- |
| Labelled good | 1750 | 151 |
| Labelled harmful | 541 | 140 |

             precision    recall  f1-score   support

      False       0.76      0.92      0.83      1901
       True       0.48      0.21      0.29       681

avg / total       0.69      0.73      0.69      2582
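The per-class numbers in the report can be recomputed from the four confusion matrix counts. The sketch below does that by hand, assuming `True` in the report denotes the harmful class, which is what the support counts of 681 and 1901 suggest. The helper name `precision_recall_f1` is hypothetical.

```python
# Counts copied from the validation confusion matrix.
tn, fp = 1750, 151   # labelled good: predicted good, predicted harmful
fn, tp = 541, 140    # labelled harmful: predicted good, predicted harmful


def precision_recall_f1(true_pos, false_pos, false_neg):
    """Precision, recall and F1 for one class, given its error counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Harmful class: a hit is a harmful changeset predicted harmful.
harmful = precision_recall_f1(tp, fp, fn)

# Good class: the roles swap, so a hit is a good changeset predicted good.
good = precision_recall_f1(tn, fn, fp)

# Share of all changesets that were classified correctly.
accuracy = (tp + tn) / (tp + tn + fp + fn)

print("harmful: precision=%.2f recall=%.2f f1=%.2f" % harmful)
print("good:    precision=%.2f recall=%.2f f1=%.2f" % good)
print("accuracy=%.2f" % accuracy)
```

Rounded to two decimals, these reproduce the report: 0.48 / 0.21 / 0.29 for the harmful class and 0.76 / 0.92 / 0.83 for the good class. The recall of 0.21 means that only 140 of the 681 changesets labelled harmful were flagged.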

Testing dataset