snipsco / snips-nlu

Snips Python library to extract meaning from text
https://snips-nlu.readthedocs.io
Apache License 2.0

Low probability. How to debug/improve? #875

Open tadly opened 4 years ago

tadly commented 4 years ago

I've been reading through all the issues I could find and the two most notable findings are:

That said, I'm not quite sure how to validate the results of cross-val-metrics. I did read the wiki articles but still struggle to make sense of them. I do have parsing_errors (quite a lot, actually) but don't know how to improve the dataset based on them.
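One way to make a long list of parsing_errors easier to read is to count which expected intent gets confused with which predicted intent. The snippet below is only a minimal sketch: it assumes each error entry carries an `expected_output` and an `nlu_output` object, each with an `intent.intentName` field, which may not match the report layout of every version. The helper names, intent names and sample entries are made up for illustration.

```python
from collections import Counter


def intent_name(parsing):
    """Return the intent name of one parsing result, or None if there is no intent."""
    intent = (parsing or {}).get("intent") or {}
    return intent.get("intentName")


def summarize_parsing_errors(parsing_errors):
    """Count (expected intent, predicted intent) pairs, most frequent first.

    Assumes each entry has "expected_output" and "nlu_output" keys
    (an assumption about the report layout, not a confirmed schema).
    """
    counts = Counter(
        (intent_name(e.get("expected_output")), intent_name(e.get("nlu_output")))
        for e in parsing_errors
    )
    return counts.most_common()


# Hypothetical sample entries; the intent names are invented for this sketch.
sample_errors = [
    {
        "expected_output": {"input": "tv lights on",
                            "intent": {"intentName": "lightsOn"}, "slots": []},
        "nlu_output": {"input": "tv lights on",
                       "intent": {"intentName": "deviceOn", "probability": 0.45},
                       "slots": []},
    },
    {
        "expected_output": {"input": "desk lights on",
                            "intent": {"intentName": "lightsOn"}, "slots": []},
        "nlu_output": {"input": "desk lights on",
                       "intent": {"intentName": "deviceOn", "probability": 0.51},
                       "slots": []},
    },
    {
        "expected_output": {"input": "lights off please",
                            "intent": {"intentName": "lightsOff"}, "slots": []},
        "nlu_output": {"input": "lights off please", "intent": None, "slots": []},
    },
]

for (expected, predicted), count in summarize_parsing_errors(sample_errors):
    # Pairs where expected == predicted would point at slot errors
    # rather than intent confusion.
    print(f"{count}x expected={expected} predicted={predicted}")
```

With a real report, the list passed in would be whatever sits under the parsing_errors key of the metrics JSON. Pairs that show up often suggest the two intents share too many similar utterances, or that the expected intent lacks examples of that phrasing.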

Removing sentences also did not help and I've been rather picky about what to add.

My dataset is an export from Dialogflow (I wrote a converter script which supports intents and entities). Within Dialogflow I made sure there are no validation errors, and the same query gives me a much higher confidence score than I get with Snips. I assume the calculation approach is very different (though that is probably hard to tell due to the closed-source nature of Dialogflow).

Here are some confidence results I get:

Dialogflow

| Query | In dataset | Confidence |
| --- | --- | --- |
| ceiling lights on | yes | 1 |
| tv lights on | no | 0.79 |
| tv on | no | 0.47 * |

* I only just now tried this query and I am not sure how to feel about that result ':D

Snips

| Query | In dataset | Confidence |
| --- | --- | --- |
| ceiling lights on | yes | 1 |
| tv lights on | no | 0.45 |
| tv on | no | 1 |

The second entry worries me. A confidence below 0.5 is quite bad for a query so similar to one within the dataset. With a previous .fit it got as low as 0.32.

Here is:

I hope someone can help, and sorry if I forgot something important.