Evaluate HSM Model - Githubissues

TiffanyAndrews commented 4 years ago

10x Qualitative Data User story

As a new ML engineer, I want the HSM running on my local machine so that I can evaluate model performance.

Acceptance criteria

[ ] Check the accuracy metric of the ML model on Qualtrics data for which the target answer is known
[ ] Use this assessment as a proxy for predictive accuracy on future data
[ ] Determine whether the HSM in its current state will do a good job of predicting the target on new and future data
[ ] Identify what modifications are needed to improve the HSM model (if needed)

csmcallister commented 4 years ago

Hi! Original model developer here. Glad to see this project still has wind in its sails!

Just wanted to note that accuracy might not be the best metric for evaluating this model's performance. iirc, there were very few instances of spam (~10%), so a model that always predicts "not spam" would be 90% accurate. A better metric might be average precision, which I believe is what I was using during training, or the F-beta score, which balances precision and recall, two metrics that are more sensitive to class imbalance than accuracy is. It all depends on what sort of error (e.g. false positives) you want to minimize. A fuller overview of classification metrics within sklearn is here: https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics
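
To make the imbalance point concrete, here is a minimal sketch in plain Python. The labels are made up to mirror the rough 10% spam rate mentioned above; they are not the project's data, and the helper names (`confusion_counts`, `accuracy`, `fbeta`) are hypothetical. In practice sklearn's `accuracy_score` and `fbeta_score` would do the same job.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = spam."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def fbeta(y_true, y_pred, beta=1.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative labels only (assumption): 10 spam (1) and 90 not spam (0).
y_true = [1] * 10 + [0] * 90
always_not_spam = [0] * 100

baseline_accuracy = accuracy(y_true, always_not_spam)
baseline_f2 = fbeta(y_true, always_not_spam, beta=2.0)
print(baseline_accuracy, baseline_f2)
```

The always-"not spam" baseline scores high on accuracy while its F-beta score is zero, because it never finds a single spam example. The choice of `beta=2.0` is only an example of favoring recall; a value below 1 would favor precision instead.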

TiffanyAndrews commented 4 years ago

@AdamGerberGSA

Focus for this week is to examine David's annotation
Link up the annotation to the data
Write a test to compare and provide a score
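
A rough sketch of what those last two steps could look like, assuming both the annotations and the model predictions can be keyed by some shared response ID. The IDs, labels, and function names below are hypothetical placeholders, not the actual annotation format or HSM output.

```python
def link_annotations(predictions, annotations):
    """Pair each annotated label with the predicted label sharing its ID.

    Both arguments are dicts mapping a response ID to a 0/1 label (1 = spam).
    Responses that lack an annotation or a prediction are skipped.
    """
    shared_ids = sorted(set(predictions) & set(annotations))
    return [(annotations[i], predictions[i]) for i in shared_ids]


def precision_recall(pairs):
    """Compute (precision, recall) from (truth, prediction) pairs."""
    tp = sum(1 for truth, pred in pairs if truth == 1 and pred == 1)
    fp = sum(1 for truth, pred in pairs if truth == 0 and pred == 1)
    fn = sum(1 for truth, pred in pairs if truth == 1 and pred == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: "r5" has a prediction but was never annotated.
predictions = {"r1": 1, "r2": 0, "r3": 0, "r4": 1, "r5": 0}
annotations = {"r1": 1, "r2": 0, "r3": 1, "r4": 0}

pairs = link_annotations(predictions, annotations)
precision, recall = precision_recall(pairs)
print(len(pairs), precision, recall)
```

Reporting precision and recall separately, rather than a single accuracy number, keeps the score meaningful when spam is rare.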

18F / 10x-MLaaS

Evaluate HSM Model #118
