jonathanpeppers / inclusive-code-reviews-ml

Machine learning for code reviews!
MIT License

New training algorithm + toxic comments #81

Closed jonathanpeppers closed 2 years ago

jonathanpeppers commented 2 years ago
.\train.ps1 -seconds 500
|                                              Top 5 models explored                                             |
------------------------------------------------------------------------------------------------------------------
|     Trainer                              MicroAccuracy  MacroAccuracy  Duration #Iteration                     |
|1    LightGbmMulti                               0.8818         0.7775      21.2          1                     |
|2    LinearSvmOva                                0.8817         0.7820       3.7          2                     |
|3    FastTreeOva                                 0.8816         0.7911      44.4          3                     |
|4    LinearSvmOva                                0.8807         0.7888      14.8          4                     |
|5    LinearSvmOva                                0.8800         0.7879       8.4          5                     |
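
For readers comparing the two accuracy columns above, here is a minimal Python sketch of how micro and macro accuracy are commonly defined for multi-class models. The confusion matrix counts are hypothetical and not taken from this repository's data; the sketch assumes micro accuracy is overall correct predictions divided by total predictions, and macro accuracy is the unweighted mean of per-class accuracy.

```python
def micro_accuracy(confusion):
    """Fraction of all examples classified correctly (every example counts equally)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def macro_accuracy(confusion):
    """Unweighted mean of per-class accuracy (every class counts equally)."""
    per_class = [row[i] / sum(row) for i, row in enumerate(confusion)]
    return sum(per_class) / len(per_class)


# Hypothetical counts: rows are actual classes, columns are predicted classes.
# Class 0 is common and mostly right; class 1 is rare and right half the time.
confusion = [
    [90, 10],
    [10, 10],
]

micro = micro_accuracy(confusion)
macro = macro_accuracy(confusion)
print(f"micro={micro:.4f} macro={macro:.4f}")
```

With an imbalanced dataset like this one, macro accuracy sits below micro accuracy because the weaker minority class carries as much weight as the majority class, which may explain the gap between the two columns in the table.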

After these changes, the model's averaged metrics look slightly better:

*************************************************************************************************************
*       Metrics for Multi-class Classification model
*------------------------------------------------------------------------------------------------------------
*       Average MicroAccuracy:     0.885  - Standard deviation: (.008)  - Confidence Interval 95%: (.008)
*       Average MacroAccuracy:     0.786  - Standard deviation: (.014)  - Confidence Interval 95%: (.014)
*       Average LogLoss:           .355  - Standard deviation: (.032)  - Confidence Interval 95%: (.031)
*       Average LogLossReduction:  .324  - Standard deviation: (.04)  - Confidence Interval 95%: (.04)
*       Average Class 0 Precision: 0.898  - Standard deviation: (.012)  - Confidence Interval 95%: (.012)
*       Average Class 1 Precision: 0.819  - Standard deviation: (.047)  - Confidence Interval 95%: (.046)
*************************************************************************************************************
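
As a rough guide to reading the LogLoss and LogLossReduction rows, the sketch below computes both for a tiny made-up example. It assumes the usual definitions: log-loss is the mean negative natural log of the probability assigned to the true class, and log-loss reduction is the relative improvement over a baseline that always predicts the class frequencies. The labels and probabilities are hypothetical.

```python
import math
from collections import Counter


def log_loss(true_class_probs):
    """Mean negative natural log of the probability given to the true class."""
    return -sum(math.log(p) for p in true_class_probs) / len(true_class_probs)


def prior_log_loss(labels):
    """Log-loss of a baseline that predicts each class with its observed frequency."""
    counts = Counter(labels)
    n = len(labels)
    return log_loss([counts[label] / n for label in labels])


def log_loss_reduction(model_loss, prior_loss):
    """Relative improvement over the baseline: 0 is no better, 1 is perfect."""
    return (prior_loss - model_loss) / prior_loss


# Hypothetical data: four labels and the model's probability for each true label.
labels = [0, 0, 0, 1]
model_probs = [0.9, 0.8, 0.9, 0.6]

model_loss = log_loss(model_probs)
baseline = prior_log_loss(labels)
print(f"log-loss={model_loss:.3f} "
      f"reduction={log_loss_reduction(model_loss, baseline):.3f}")
```

Under these definitions, lower log-loss is better and higher log-loss reduction is better, so both rows need to be read alongside the accuracy figures.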

I also added a couple hundred rows from toxic-comments.csv to the training data.