dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Confidence when using text classification #5743

Open PeterKottas opened 3 years ago

PeterKottas commented 3 years ago

I am developing an intent classification model using the text classification pipeline. The model predicts an intent from a question supplied by the user. Things work relatively well when the phrases are close to the training set, but I start getting seemingly random results for totally unrelated terms. Looking at the scores, they appear to be normalized so that they add up to 1. That means I can get a relatively high score (often around 0.8) for a phrase that has nothing to do with any question in the training set. I am wondering whether there is: a) a way to get unnormalized scores, e.g. scores whose sum lies in the range [0, 1], where the closer the sum is to 1, the higher the confidence; or b) a general confidence value that would simply tell me whether I should trust the output of the prediction.
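To illustrate why normalized scores can look confident on unrelated input, here is a minimal Python sketch. It assumes the normalization is softmax-style, which is only an assumption: the thread establishes that the scores sum to 1, not how they are computed. The raw score values are made up for illustration.

```python
import math

def softmax(raw_scores):
    """Normalize raw scores so they are positive and sum to 1."""
    peak = max(raw_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three intents (not taken from a real model).
strong_match = [2.0, 0.1, 0.1]    # the first intent clearly fits
weak_match = [-3.0, -4.9, -4.9]   # nothing fits well, but the gaps are the same

p_strong = softmax(strong_match)
p_weak = softmax(weak_match)

# Both lines show a top score of about 0.77, because softmax only sees
# the differences between raw scores, not their absolute level.
print([round(p, 3) for p in p_strong])
print([round(p, 3) for p in p_weak])
```

Under that assumption, the normalized top score says which intent is the best of the available options, not whether any option is a good fit.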

I was considering creating two models. The first would be trained on a bunch of random text labeled 0 plus the actual training phrases labeled 1; the second would be trained on just the training phrases. The first model would then give me the confidence, while the second would give me the actual result. This, however, seems like huge overkill, especially considering that I would probably have to include thousands of text samples carefully picked not to resemble the actual training phrases... Seems awful, but nothing else comes to mind. Has anybody approached this in a more reasonable way?
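A rough Python sketch of how the two-model idea would be wired together. Both models are passed in as plain functions; `in_domain_score`, `predict_intent`, the toy stand-ins, and the 0.5 threshold are all hypothetical placeholders, not ML.NET APIs.

```python
from typing import Callable, Optional

def predict_with_gate(
    text: str,
    in_domain_score: Callable[[str], float],  # hypothetical gate model: 1 = looks like training data
    predict_intent: Callable[[str], str],     # hypothetical intent classifier
    gate_threshold: float = 0.5,              # assumed cut-off, would need tuning
) -> Optional[str]:
    """Return an intent only if the gate model considers the text in-domain."""
    if in_domain_score(text) < gate_threshold:
        return None  # reject instead of forcing a label
    return predict_intent(text)

# Toy stand-ins so the sketch runs; real models would replace these.
def toy_gate(text: str) -> float:
    return 0.9 if "password" in text.lower() else 0.1

def toy_intent(text: str) -> str:
    return "reset_password"

print(predict_with_gate("How do I reset my password?", toy_gate, toy_intent))
print(predict_with_gate("What is the capital of Peru?", toy_gate, toy_intent))
```

The wiring itself is simple; the cost is in collecting enough out-of-domain text to train the gate.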

PeterKottas commented 3 years ago

:hourglass_flowing_sand: It gets quiet in here :)

Anyway, this is what I did for now, but bear in mind that it is quite specific to my knowledge-base kind of AI.

1. Predict label with ML.NET
2. Compare score to a threshold
3. Fetch all questions for predicted answer
4. Check all of these against the input with [fuzzy matching](https://github.com/JakeBayer/FuzzySharp)
5. Take the best fuzzy score and compare it with a second threshold
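
The steps above can be sketched roughly as follows, in Python for brevity. This is not the actual implementation: `difflib.SequenceMatcher` from the standard library stands in for FuzzySharp, `predict` stands in for the ML.NET prediction call, and both thresholds and the sample questions are made-up values.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Tuple

def fuzzy_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; a stand-in for a FuzzySharp ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def answer_or_none(
    question: str,
    predict: Callable[[str], Tuple[str, float]],  # hypothetical: returns (label, score)
    questions_by_label: Dict[str, List[str]],     # training questions per answer
    score_threshold: float = 0.6,                 # assumed value
    fuzzy_threshold: float = 0.7,                 # assumed value
) -> Optional[str]:
    label, score = predict(question)                  # step 1
    if score < score_threshold:                       # step 2
        return None
    candidates = questions_by_label.get(label, [])    # step 3
    if not candidates:
        return None
    best = max(fuzzy_ratio(question, c) for c in candidates)  # step 4
    return label if best >= fuzzy_threshold else None         # step 5

# Toy data; the classifier is deliberately overconfident on everything.
known = {"reset_password": ["How do I reset my password?", "I forgot my password"]}

def overconfident(question: str) -> Tuple[str, float]:
    return ("reset_password", 0.8)

print(answer_or_none("how can I reset my password", overconfident, known))
print(answer_or_none("What is the capital of Peru?", overconfident, known))
```

The second check is what catches the unrelated question here: the classifier score passes the first threshold, but no training question for the predicted answer is textually close to the input.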

I am still very much looking forward to hearing a better solution because this is IMHO quite hacky.