Intent classification with garbage text

Aljumaili85 commented 1 year ago

Make sure the issue is NLU related

[X] I confirm that the reported bug concerns the NLU independently of Botpress

Operating system

Windows

Product used

NLU Server

Deploy Option

Binary

Version

12.30.6

Configuration File

No response

CLI Arguments

No response

Environment variables

No response

Description of the bug

Hi there, I am using the NLU engine as a simple intent classifier, so I train the model using very simple utterances with no slots and no entities. I notice that when I use, as input, a simple phrase containing only the main keywords, the returned result is excellent, but when I add some garbage text to my phrase, the confidence dramatically goes down.

By garbage text, I mean filler such as "please, I would like ..." or "can you please ...". I assumed that the NLU could identify the labels for these words, but unfortunately adding them sharply lowers the confidence of the returned results. Example:

"opening hours of the swimming pool" --> result { "intent": "info_swimming_pool", "confidence": "0.9576875041845723" }
"please, I would like to know the opening hours of the swimming pool" --> result { "intent": "info_swimming_pool", "confidence": "0.2959758827595158" }

The language used in my case is Italian. My language server is running with no problems, and the training process was successful. Any suggestions?
Thanks

franklevasseur commented 1 year ago

Hi @Aljumaili85,

I have a couple of questions:

Regardless of the confidence numbers, was the correct intent identified?
Where do these confidence numbers originate? Do you directly call the NLU server or do you utilize it within a Botpress v12 flow using event.nlu?

If the correct intent was identified and the confidence numbers you are observing are derived from event.nlu within a Botpress flow, then there's no need to worry. The confidence scores may not hold significant meaning in this context.

Frank

Aljumaili85 commented 1 year ago

Hi @franklevasseur, thanks for your reply.

Regardless of the confidence numbers, was the correct intent identified?

Yes, the intent was right in all cases, simply because I have no other intent.

Where do these confidence numbers originate? Do you directly call the NLU server or do you utilize it within a Botpress v12 flow using event.nlu?

I tried both cases: Botpress and the standalone server (the latest binary version, 1.0.2). I can't tell which one gives more accurate results, because different models give different results. I notice that when I retrain using the same intent with the same utterances, the new model gives different confidence values compared to the old one. That is acceptable if the difference is 1% or 2%, but sometimes I get a difference of as much as 20%!
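One way to quantify that variation is to record the confidence returned for the same input after each retraining and look at the spread. The sketch below is only an illustration: the `confidence_spread` helper is hypothetical, and the values in `runs` are made-up placeholders, not measurements from this issue.

```python
def confidence_spread(confidences):
    """Return (lowest, highest, highest - lowest) for a list of confidence scores."""
    if not confidences:
        raise ValueError("need at least one confidence value")
    lowest, highest = min(confidences), max(confidences)
    return lowest, highest, highest - lowest


# Placeholder values: one confidence per retrained model, for the same input phrase.
runs = [0.30, 0.42, 0.50]

lowest, highest, spread = confidence_spread(runs)
print(f"min={lowest:.2f} max={highest:.2f} spread={spread:.2f}")
```

A large spread across retrainings on identical data points to training randomness rather than to the input phrase itself.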

Command to run the language server: nlu-v1_0_2-win-x64.exe lang --offline --dim 300 --langDir path/to/ItalianlanguageFiles

Command to run the NLU server: nlu-v1_0_2-win-x64.exe nlu

Intent JSON:

{
    "language" : "it",
    "intents": [{
        "name":"orario_piscina",
        "contexts":["global"],
        "utterances":[
            "qual è l'orario di apertura della piscina",
            "qual è l'orario della piscina coperta",
            "l'orario di apertura della piscina scoperta",
            "l'orario della piscina",
            "quali sono gli orari di apertura della piscina coperta",
            "quali sono gli orari della piscina scoperta",
            "gli orari di apertura della piscina",
            "gli orari della piscina coperta",
            "a che ora è aperta la piscina scoperta",
            "a che ora è chiusa la piscina",
            "a che ora apre la piscina coperta",
            "quando apre la piscina scoperta",
            "quando aprono le piscine",
            "quando posso andare a nuotare",
            "nuotare in piscina",
            "quanto rimangono aperte le piscine coperte",
            "in quali orari posso andare a nuotare",
            "a che ora apre la piscina domani mattina",
            "a che ora chiude la piscina oggi",
            "a che ora chiudono le piscine della nave",
            "a che ora dovremo andare in piscina",
            "a che ora posso andare in piscina",
            "fino a che ora sono aperte le piscine",
            "la piscina a che ora apre",
            "la piscina coperta a che ora chiude"
        ],
        "slots": []
    }],
    "contexts": ["global"],
    "entities": []
}

Thank you!

franklevasseur commented 1 year ago

Hello again,

I have a few points to address:

The NLU Server has primarily been tested with approximately 5 to 10 intents, each consisting of 5 to 10 utterances. Your situation of having only one intent deviates from the typical use case.
It's worth noting that the NLU Server is considerably outdated. Its engine was developed in 2018, utilizing technologies that have been available since 2016. Consequently, it falls significantly short of producing GPT-like results.
Retraining the model will yield different outcomes. Typically, increasing the volume of training data helps reduce variability. If you desire consistent results, you can specify a training seed in the train request body.
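As a rough sketch of the last point, a train request body with a fixed seed could be assembled as follows. The field name `seed` and the `build_train_request` helper are assumptions made for illustration, so check the NLU Server's API documentation for the exact schema before relying on them.

```python
import json


def build_train_request(language, intents, seed):
    """Build a train request body; the `seed` field name is an assumption."""
    return {
        "language": language,
        "intents": intents,
        "contexts": ["global"],
        "entities": [],
        # Assumed field: a fixed seed is meant to make retraining repeatable.
        "seed": seed,
    }


# Shortened copy of the intent from this issue (two utterances only).
intents = [{
    "name": "orario_piscina",
    "contexts": ["global"],
    "utterances": ["l'orario della piscina", "quando aprono le piscine"],
    "slots": [],
}]

body = build_train_request("it", intents, seed=42)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON would then be sent as the body of the train request; the sketch stops short of making the HTTP call.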

Instead of relying on the confidence number (which has little value), you can try adding more intents and see if the accuracy is good enough.
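A minimal way to check accuracy instead of confidence is to run a small labeled test set through the model and count correct intents. In the sketch below, `fake_predict` is a hypothetical stand-in for a real call to the NLU server, and `orario_ristorante` is an invented second intent used only for illustration.

```python
def accuracy(predict, labeled_utterances):
    """Fraction of utterances whose predicted intent matches the expected intent."""
    if not labeled_utterances:
        raise ValueError("need at least one labeled utterance")
    correct = sum(1 for text, expected in labeled_utterances if predict(text) == expected)
    return correct / len(labeled_utterances)


def fake_predict(text):
    """Hypothetical predictor: replace with a request to the NLU server."""
    return "orario_piscina" if "piscina" in text else "none"


# Test phrases include filler words, as in the original report.
test_set = [
    ("per favore, vorrei sapere l'orario della piscina", "orario_piscina"),
    ("a che ora apre il ristorante", "orario_ristorante"),
]

print(accuracy(fake_predict, test_set))
```

Keeping the test phrases separate from the training utterances gives a fairer picture of how the model handles unseen wording.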

You might also be interested in Botpress Cloud, which takes advantage of large language models.

I hope this information is helpful.

Best regards, Frank

Aljumaili85 commented 1 year ago

Thank you @franklevasseur. I appreciate this info. I will try adding more intents and check again. After this explanation, I think there is no need to leave this issue open. Thanks again
