18F / 10x-MLaaS

Repository for the machine learning tool MeL, which assists in providing insights for open-text data. This tool is part of the 10x Machine Learning as a Service project (formerly known as Qualitative Data Management).
https://10x.gsa.gov

Evaluate HSM Model #118

Open TiffanyAndrews opened 4 years ago

TiffanyAndrews commented 4 years ago

10x Qualitative Data User story

As a new ML engineer, I want the HSM running on my local machine so that I can evaluate model performance.

Acceptance criteria

csmcallister commented 4 years ago

Hi! Original model developer here. Glad to see this project still has wind in its sails!

Just wanted to note that accuracy might not be the best metric for evaluating this model's performance. IIRC, there were very few instances of spam (~10%), so a model that always predicts "not spam" would be 90% accurate. A better metric might instead be average precision, which I believe is what I was using to train, or the F-beta score, which balances precision and recall. Both precision and recall are computed on the positive (spam) class, so they are more sensitive to class imbalance than accuracy is. It all depends on what sort of error (e.g. false positives) you want to minimize. A fuller overview of classification metrics within sklearn is here: https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics
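
To make the imbalance point concrete, here is a minimal sketch in plain Python. The data is made up for illustration (100 comments, 10 of them spam), and `some_model` is a hypothetical classifier, not the HSM. The helper functions are written by hand so the example has no dependencies; in practice `sklearn.metrics.accuracy_score` and `sklearn.metrics.fbeta_score` compute the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (spam) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def fbeta(y_true, y_pred, beta=1.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Convention assumed here: return 0.0 when there are no positive
    # predictions and no positive labels to score against.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical labels: 10 spam (1) followed by 90 not-spam (0).
y_true = [1] * 10 + [0] * 90

# Baseline that always predicts "not spam".
always_not_spam = [0] * 100

# Made-up classifier: catches 8 of the 10 spam, with 5 false positives.
some_model = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 85

for name, preds in [("always not spam", always_not_spam), ("some model", some_model)]:
    print(f"{name}: accuracy={accuracy(y_true, preds):.2f}, "
          f"F2={fbeta(y_true, preds, beta=2.0):.2f}")
```

The two predictors look similar on accuracy, but the F-beta score separates them, because the always-"not spam" baseline never finds a single spam comment. Average precision is left out of the sketch since it needs predicted scores rather than hard labels.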

TiffanyAndrews commented 4 years ago

@AdamGerberGSA