With the current workflow, every time we have a new trained model, we generate two CSV files for manual review:

- Fifty unlabelled changesets predicted good
- Another fifty unlabelled changesets predicted problematic
**Current workflow**

1. Sort changesets in descending order of Gabbar predictions.
2. Select the top 50 rows: changesets with a prediction of 1, denoting problematic.
3. Select the bottom 50 rows: changesets with a prediction of 0, denoting good (see the sketch after these steps).
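For reference, a minimal sketch of that selection with pandas, assuming the predictions live in a CSV with a `prediction` column (the file and column names here are hypothetical, the real notebook may use different ones):

```python
import pandas as pd

# Hypothetical file and column names.
predictions = pd.read_csv("gabbar-predictions.csv")

# Sort so changesets predicted problematic (1) come first.
predictions = predictions.sort_values("prediction", ascending=False)

# Top 50 rows: predicted problematic; bottom 50 rows: predicted good.
predicted_problematic = predictions.head(50)
predicted_good = predictions.tail(50)

predicted_problematic.to_csv("predicted-problematic.csv", index=False)
predicted_good.to_csv("predicted-good.csv", index=False)
```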
The challenge here is that changesets are ordered by changeset ID by default, so we don't get good variety in the results for manual 👀
Let's automate this step so that when the notebook is run, the changesets for manual review are generated automatically.
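One way to do this (a sketch only, again with hypothetical file and column names) is to sample 50 changesets at random from each predicted class inside the notebook, which also gives reviewers a mix of changeset IDs rather than just the top and bottom of the sorted file:

```python
import pandas as pd

# Hypothetical file and column names.
predictions = pd.read_csv("gabbar-predictions.csv")

# Randomly sample 50 changesets from each predicted class so the
# manual review set isn't dominated by the lowest/highest changeset IDs.
predicted_problematic = predictions[predictions["prediction"] == 1].sample(
    n=50, random_state=42
)
predicted_good = predictions[predictions["prediction"] == 0].sample(
    n=50, random_state=42
)

predicted_problematic.to_csv("predicted-problematic.csv", index=False)
predicted_good.to_csv("predicted-good.csv", index=False)
```

Passing a fixed `random_state` keeps the sample reproducible between notebook runs; drop it if a fresh sample is wanted each time.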
Per https://github.com/mapbox/gabbar/issues/43#issuecomment-307729045