Closed bkowshik closed 7 years ago
Curious to see the effect that the size of the training set has on the model's metrics. We have the following: the roc_auc score is 0.8 with 6,000 samples. What would it look like with 10,000 samples?

cc: @anandthakker
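One way to answer this kind of question is a learning curve: score the model with roc_auc at several training-set sizes and compare. The snippet below is only a sketch under assumptions. It uses scikit-learn's `learning_curve` with a synthetic dataset from `make_classification` and a `LogisticRegression` model as stand-ins, since the real labelled changesets and the project's actual classifier are not shown in this thread.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic stand-in for the labelled changesets (assumption: not the real data).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Cross-validated ROC AUC at five increasing training-set sizes.
train_sizes, train_scores, valid_scores = learning_curve(
    LogisticRegression(max_iter=1000),  # placeholder model, not the project's
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="roc_auc",
    shuffle=True,
    random_state=0,
)

# Average the validation score over the folds for each training size.
for size, score in zip(train_sizes, valid_scores.mean(axis=1)):
    print(f"{int(size):>5d} samples: validation roc_auc = {score:.3f}")
```

If the validation score is still climbing at the largest size, more labelled samples are likely to help; if it has flattened, extra samples alone may not move the metric much.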
Before, we had 8,620 labelled samples, of which 6,036 were used for training and 2,584 for validation. With the backfill done, we now have 10,165, of which we use 7,115 for training and 3,050 for validation. That adds 1,545 new changesets to the labelled dump. 🎉

Interestingly, the nice upward graph has now become something like the one below. I don't understand why this is happening though.
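As a quick sanity check on the numbers quoted above, the following sketch confirms that each pair of counts adds up to its total and shows the split ratio. It assumes the 7,115 figure is the training portion of the new split, mirroring the earlier training/validation split; the dictionary keys are just labels chosen for illustration.

```python
# Sample counts quoted in this thread: (total, training, validation).
# Assumption: 7,115 is the training portion of the post-backfill split.
splits = {
    "before backfill": (8620, 6036, 2584),
    "after backfill": (10165, 7115, 3050),
}

for name, (total, train, valid) in splits.items():
    # The two portions should account for every labelled sample.
    assert train + valid == total
    print(f"{name}: {train / total:.1%} training, {valid / total:.1%} validation")

# Number of labelled changesets gained from the backfill.
new_changesets = splits["after backfill"][0] - splits["before backfill"][0]
print(new_changesets)  # 1545
```

Both splits come out at roughly 70% training and 30% validation, so the ratio stayed the same and only the sample count changed between the two runs.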
We are 💯 to close here.
Ref https://github.com/mapbox/gabbar/issues/43
5,269 changesets for training our feature level classifier.
4,000 changesets.

Next actions
4,000 changesets - @bkowshik

cc: @batpad @geohacker