LSI computes a lot of information about the quality of a
classification guess that is hidden from the caller.
Sometimes the guess is of very high quality and sometimes it's a
toss-up, but the caller has no way to distinguish the two.
It would be nice if there were an option to get not only the
classification but also a measure of the system's confidence.
I propose adding a method classify_with_confidence(text) that returns
both the classification AND the confidence as a number between
0.0 and 1.0, e.g.:

  guess, confidence = lsi.classify_with_confidence(text)

A caller could then choose to ignore any classification guess whose
confidence falls below some threshold.
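As a rough sketch of how such a confidence could be derived, one option is to normalize the margin between the top two per-category similarity scores. This is a hypothetical, self-contained illustration: the method name, the `scores` hash, and the margin-based formula are all assumptions, not the gem's actual internals.

```ruby
# Hypothetical sketch: derive a 0..1 confidence from per-category
# similarity scores. The scoring scheme here is an assumption for
# illustration, not the library's real implementation.
def classify_with_confidence(scores)
  # Rank categories from best score to worst.
  sorted = scores.sort_by { |_, score| -score }
  best, runner_up = sorted[0], sorted[1]

  # Confidence is the gap between the top two scores, normalized
  # so it falls between 0.0 (toss-up) and 1.0 (clear winner).
  margin = best[1] - runner_up[1]
  total  = best[1].abs + runner_up[1].abs
  confidence = total.zero? ? 0.0 : margin / total

  [best[0], confidence]
end

guess, confidence = classify_with_confidence("Sports" => 0.92, "Politics" => 0.15)
```

With this shape, a near-tie between the top two categories yields a confidence near 0.0, while a dominant top score yields a confidence near 1.0, which gives callers the threshold knob described above.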