Closed bkowshik closed 7 years ago
In the scenario where all changesets predicted to be potentially harmful on osmcha are 👀 by users on osmcha, I was wondering 💭 if we could calculate the hit rate as follows.
1,201 + 1,964 = 3,165 changesets
Hit rate: 1,201 / 3,165 = 37.95%
😬 i.e., if 100 changesets labelled by the current model as potentially problematic are manually 👀, we should expect to find about 38 of them to be actually problematic.
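To make the arithmetic explicit, here is a minimal sketch of the calculation. It assumes that 1,201 is the number of flagged changesets confirmed as problematic on review and 3,165 is the total number of flagged changesets that were reviewed; the function name `hit_rate` is made up for illustration.

```python
def hit_rate(confirmed_harmful: int, total_flagged: int) -> float:
    """Fraction of flagged changesets that reviewers confirmed as problematic."""
    if total_flagged <= 0:
        raise ValueError("total_flagged must be positive")
    return confirmed_harmful / total_flagged


# Numbers from this post; their interpretation is an assumption (see above).
rate = hit_rate(confirmed_harmful=1201, total_flagged=3165)
print(f"Hit rate: {rate:.2%}")

# Expected confirmed-problematic changesets per 100 flagged ones.
per_hundred = round(rate * 100)
print(f"About {per_hundred} of every 100 flagged changesets")
```

Under that reading, the hit rate is the same quantity as the precision of the "potentially harmful" class, measured on manually reviewed changesets.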
NOTE: Posting here for feedback on whether this is the right way to measure Hit Rate.
The model parameters that yielded the best model performance are:
{
"probability": true,
"C": 10000,
"gamma": "auto",
"cache_size": 800,
"class_weight": "balanced",
"kernel": "rbf"
}
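The parameter names above match keyword arguments of scikit-learn's `SVC` class. Assuming that is the library in use (the post only says SVC), a minimal sketch of building the classifier from this JSON could look like the following. The variable names are illustrative, and no training data is included here.

```python
import json

from sklearn.svm import SVC

# The parameters exactly as posted above.
params_json = """
{
    "probability": true,
    "C": 10000,
    "gamma": "auto",
    "cache_size": 800,
    "class_weight": "balanced",
    "kernel": "rbf"
}
"""

params = json.loads(params_json)

# Unpack the dictionary into keyword arguments; the model is not fitted yet.
model = SVC(**params)
print(model.get_params()["C"], model.kernel)
```

With `probability` enabled, a fitted model exposes `predict_proba`, at the cost of slower training because scikit-learn runs an internal cross-validation to calibrate the probabilities.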
NOTE: Have updated this post to reflect new performance numbers and graph.
Tweaking model parameters seems to have quite a lot of impact on the results. The results are looking so much better when compared to the previous run.
0.8382 (potentially harmful predictions from the model), 1775 changesets, 5 minutes.
Current best model parameters for the SVC model:
Next actions
cc: @anandthakker