mapbox / gabbar

Guarding OpenStreetMap from harmful edits using machine learning
MIT License

Automate preparation of changesets for manual review #52

Closed by bkowshik 7 years ago

bkowshik commented 7 years ago

Per https://github.com/mapbox/gabbar/issues/43#issuecomment-307729045

With the current workflow, every time we train a new model, we generate two CSV files for manual review:

  1. Fifty unlabelled changesets predicted good
  2. Another fifty unlabelled changesets predicted problematic

Current workflow

  1. Sort changesets in descending order of Gabbar prediction.
  2. Select the top 50 rows: changesets with a prediction of 1, denoting problematic.
  3. Select the bottom 50 rows: changesets with a prediction of 0, denoting good.
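
For reference, the sort-and-slice steps above can be sketched in plain Python. This is only an illustration, not the notebook's actual code: the field names `changeset_id` and `prediction`, the helper `select_for_review`, and the sample rows are all assumptions.

```python
# Hypothetical rows; the field names `changeset_id` and `prediction` are assumptions.
rows = [
    {"changeset_id": 1000 + i, "prediction": 1 if i % 4 == 0 else 0}
    for i in range(400)
]


def select_for_review(rows, n=50):
    """Sort by prediction (descending) and take n rows from each end."""
    # sorted() is stable, so rows with equal predictions keep their
    # original (changeset ID) order.
    ranked = sorted(rows, key=lambda r: r["prediction"], reverse=True)
    return ranked[:n], ranked[-n:]


problematic, good = select_for_review(rows)
```

Because the sort is stable, each slice comes out as a run of rows in changeset ID order rather than a spread across the whole dataset.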

The challenge is that changesets are ordered by changeset ID by default, so the 50 rows taken from each end of the sorted list tend to have neighbouring IDs, and we don't get good variety in the results for manual 👀

Let's automate this step, so that when the notebook is run, the changesets for manual review are generated automatically.
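
One possible way to automate this with better variety is to sample at random within each predicted class instead of slicing a sorted list. The sketch below is a suggestion under assumptions, not what the notebook does: the field names, the helpers `sample_for_review` and `write_review_csv`, and the sample rows are all hypothetical.

```python
import csv
import random

# Hypothetical rows; the field names `changeset_id` and `prediction` are assumptions.
rows = [
    {"changeset_id": 1000 + i, "prediction": 1 if i % 4 == 0 else 0}
    for i in range(400)
]


def sample_for_review(rows, n=50, seed=None):
    """Randomly pick up to n changesets from each predicted class."""
    rng = random.Random(seed)  # pass a seed for a reproducible sample
    problematic = [r for r in rows if r["prediction"] == 1]
    good = [r for r in rows if r["prediction"] == 0]
    return (
        rng.sample(problematic, min(n, len(problematic))),
        rng.sample(good, min(n, len(good))),
    )


def write_review_csv(path, rows):
    """Write the selected changesets to a CSV file for manual review."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["changeset_id", "prediction"], extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)


problematic, good = sample_for_review(rows, n=50, seed=0)
```

Calling `write_review_csv("problematic.csv", problematic)` and `write_review_csv("good.csv", good)` at the end of the notebook would then produce the two review files on every run; the file names here are placeholders.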

bkowshik commented 7 years ago

Moving away from the CSV-based workflow towards: https://github.com/mapbox/gabbar/issues/47