RasaHQ / rasa

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
https://rasa.com/docs/rasa/
Apache License 2.0

Bug in system #439

Closed ghost closed 7 years ago

ghost commented 7 years ago

I use Python 3.5 and created training examples like this:

{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "What category of assets does the American Century One Choice 2025 C fund hold?",
        "entities": [],
        "intent": "getFundAssetType"
      },
      {
        "text": "Asset alloaction of American Century One Choice 2025 C",
        "entities": [],
        "intent": "getFundAssetType"
      },
....

and got

/home/vshebuniayeu/anaconda3/lib/python3.5/site-packages/sklearn/metrics/classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)

Because of this bug, intent classification works incorrectly.

tmbo commented 7 years ago

This just indicates that you don't have enough training data. In what way do you experience that "intent classification is working wrong"?

ghost commented 7 years ago

For intent classification I used the same sentence that was in the training set, and the result was totally wrong: the correct answer was only in 6th position by probability.

wrathagom commented 7 years ago

As Tom said above, it appears that you are using too few training examples. In that case the model created from the training data will be of little to no use. Increase the amount of training data.

This isn't a lookup, so using the same sentence doesn't guarantee a perfect match. The sentence has to be processed by the model, and if the model wasn't generated correctly (because of too few training examples) then it will not classify the intent correctly.
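To illustrate the "not a lookup" point, here is a minimal sketch (not Rasa's actual pipeline; the sentences and intents are invented) showing that an SVM trained on a handful of examples returns a probability distribution even for a sentence copied verbatim from the training set:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Four examples per intent -- far fewer than a reliable model needs.
texts = [
    "What category of assets does the fund hold?",
    "Asset allocation of American Century One Choice 2025 C",
    "Which asset classes is the fund invested in?",
    "Show me the fund's asset breakdown",
    "What is the expense ratio of the fund?",
    "How much does the fund charge in fees?",
    "Tell me the fund's annual fees",
    "What are the management costs of the fund?",
]
intents = ["getFundAssetType"] * 4 + ["getFundFees"] * 4

clf = make_pipeline(TfidfVectorizer(), SVC(probability=True))
clf.fit(texts, intents)

# The model re-processes the sentence; nothing guarantees confidence near 1.0
# even though this exact sentence appears in the training data.
proba = clf.predict_proba([texts[0]])[0]
print(dict(zip(clf.classes_, proba)))
```

With so little data the Platt-scaled probabilities are poorly calibrated, which is exactly why a verbatim training sentence can still rank low.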

tmbo commented 7 years ago

Great answer :+1:

ghost commented 7 years ago

Your framework doesn't report any test results, and I didn't find any notes about the accuracy of your intent recognition. I already increased the amount of training data, but the results are still very bad.

tmbo commented 7 years ago

Are you still seeing the warning message you posted?

ghost commented 7 years ago

No, but intent classification is still very bad. I tried all the questions from the training data, and the classification is wrong.

wrathagom commented 7 years ago

Can you post your training data in a gist and paste the link back here?

ghost commented 7 years ago

@wrathagom You are not right. You wrote:

"As Tom said above it appears that you are using too few training examples. In this case the model created from the training data will be of little to no use. Increase the amount of training data. This isn't a lookup, so using the same sentence doesn't guarantee a perfect match. The sentence has to be processed by the model, and if the model wasn't generated correctly (because of too few training examples) then it will not classify the intent correctly."

First of all, it depends on what kind of classifier you use: some of them require more training data, some do not. And if the classifier is bad, increasing the training data will not help. These are junior-level issues in Data Science. You are also talking only about statistical models; if your system used grammatical models, you would not need a lot of data.

And you still haven't answered my question: if you use statistical models, where is your test of intent prediction accuracy? As usual you should have training data, CV data, and test data, and report the accuracy on the test set as a percentage. Please provide it.

wrathagom commented 7 years ago

The Rasa implementation of sklearn classification uses a support vector machine; more documentation can be found here: http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
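The sklearn intent classifier also tunes the SVM with a small cross-validated grid search, which is where the "Fitting 2 folds for each of 6 candidates, totalling 12 fits" line in the training log comes from. A rough sketch on synthetic data (the parameter grid here is an assumption, not Rasa's exact settings):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for featurized intent examples: 4 classes, 10 each.
X, y = make_classification(n_samples=40, n_features=10, n_informative=5,
                           n_classes=4, random_state=0)

# 6 candidate values of C evaluated over 2 CV folds -> 12 fits in total.
search = GridSearchCV(SVC(kernel="linear", probability=True),
                      {"C": [1, 2, 5, 10, 20, 100]},
                      cv=2, verbose=1)
search.fit(X, y)
print(search.best_params_)
```

With `verbose=1`, GridSearchCV prints a fit-count summary much like the one in the log below.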

During a normal training run that succeeds the output looks like this:

rasa_1             | INFO:root:Starting model training
rasa_1             | INFO:root:Training process <Process(Process-1, started)> started
rasa_1             | 172.19.0.5 - - [2017-06-26 22:13:53] "POST /train?name=test_2017-06-26T221353.530Z HTTP/1.1" 200 188 0.020378
rasa_1             | INFO:root:Trying to load spacy model with name 'en'
rasa_1             | INFO:root:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.
rasa_1             | INFO:root:Training data format at /tmp/tmpX4jlsy_training_data.json is rasa_nlu
rasa_1             | INFO:root:Training data stats:
rasa_1             |    - intent examples: 253 (4 distinct intents)
rasa_1             |    - found intents: affirm, goodbye, greet, restaurant_search
rasa_1             |    - entity examples: 222 (2 distinct entities)
rasa_1             |    - found entities: cuisine, location
rasa_1             |
rasa_1             | INFO:root:Starting to train component nlp_spacy
rasa_1             | INFO:root:Finished training component.
rasa_1             | INFO:root:Starting to train component ner_crf
rasa_1             | INFO:root:Finished training component.
rasa_1             | INFO:root:Starting to train component ner_synonyms
rasa_1             | INFO:root:Finished training component.
rasa_1             | INFO:root:Starting to train component intent_featurizer_spacy
rasa_1             | INFO:root:Finished training component.
rasa_1             | INFO:root:Starting to train component intent_classifier_sklearn
rasa_1             | [Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.2s finished
rasa_1             | INFO:root:Finished training component.
rasa_1             | INFO:root:Successfully saved model into '/usr/src/rasa_nlu/models/test_2017-06-26T221353.530Z'
rasa_1             | Fitting 2 folds for each of 6 candidates, totalling 12 fits

and parsing against that model would return something like this:

    {
        "domain": "test",
        "entities": [],
        "intent": {
            "confidence": 0.8322380463350821,
            "name": "greet"
        },
        "intent_ranking": [
            {
                "confidence": 0.8322380463350821,
                "name": "greet"
            },
            {
                "confidence": 0.09311768996416162,
                "name": "affirm"
            },
            {
                "confidence": 0.0620525211985934,
                "name": "goodbye"
            },
            {
                "confidence": 0.01259174250216282,
                "name": "restaurant_search"
            }
        ]
    }

If sklearn performs a self-test, the results aren't exposed via the Rasa API.

Training data is provided here: https://github.com/RasaHQ/rasa_nlu/blob/master/data/examples/rasa/demo-rasa.json and I can confirm that Rasa, with multiple pipeline configurations, parses it with 100% accuracy.

tmbo commented 7 years ago

"And you still haven't answered my question: if you use statistical models, where is your test of intent prediction accuracy? You should have training data, CV data, and test data, and report the accuracy on the test set as a percentage. Please provide it."

We don't provide pre-trained intent / entity models. Hence, you need to evaluate your models yourself (we cannot provide performance metrics for them).
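Since there are no pre-trained models, evaluation is up to the user. A minimal sketch of doing this yourself with cross-validation (the sentences and intents are invented, and this mirrors the underlying sklearn pipeline rather than Rasa's own API):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical labelled examples; substitute your own training data.
texts = [
    "hello there", "hi", "good morning", "hey how are you",
    "bye", "goodbye", "see you later", "catch you tomorrow",
]
labels = ["greet"] * 4 + ["goodbye"] * 4

clf = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))

# 4-fold stratified CV: each fold holds out one example per intent.
scores = cross_val_score(clf, texts, labels, cv=4)
print("mean accuracy:", scores.mean())
```

This is the standard train/CV/test discipline the reporter is asking about, applied to one's own data rather than shipped as a framework-wide accuracy figure.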

ghost commented 7 years ago

@wrathagom You wrote "I can confirm that Rasa with multiple pipeline configurations is 100% accurate in parsing it." I was not talking about parsing accuracy, but about prediction accuracy.

amn41 commented 7 years ago

@vladimircape I would just remind you of our code of conduct - please refrain from statements like "It's junior issues in Data Science"

PHLF commented 7 years ago

@vladimircape People here take time to help you. You may not be happy with the answers you get but please show a little respect.

ghost commented 7 years ago

@PHLF Why do you think people want to help me? I am showing them problems in their system. I asked a simple Data Science question, namely what the accuracy of your intent classifier is, and the author didn't understand the question and started writing about another issue that I didn't ask about. I don't want to lie: it's a junior-level Data Science question. If the authors don't understand it, they can delete the topic, but they will only hurt themselves, because other users will use their library, run into the same problem, and simply drop the rasa library. As you can see, I didn't receive an answer, but the author marked the bug as closed. That is not polite, so why should I be polite? I didn't sign a code of conduct.

PHLF commented 7 years ago

No, of course you're free to do anything you want and be aggressive, but then just expect not to be answered. As for your statement:

Why do you think people want to help me? I am showing them problems in their system.

@wrathagom is not a member of the company building Rasa (neither am I), so yes, he takes his spare time to answer you, and I can understand him closing this topic when faced with such a childish attitude. Still not satisfied? Go ahead and put some of your "Data Science expertise" into a product of your own that outperforms Rasa: that's what open source is for.

wrathagom commented 7 years ago

@vladimircape Of course we are happy to help, but in order to receive help I believe @amn41 would like everyone to abide by the Code of Conduct. Seeing as this discussion isn't progressing anywhere I am inclined to lock it, but I would like to offer one more opportunity for us to assist you.

Rasa brings multiple tools together in order to offer functionality similar to API.ai and other "chatbot" providers. One of the tools it includes is sklearn. The message you are seeing is actually just passed through by Rasa from sklearn's metrics module. Some more reading is available if you'd like to dive in further.

This is not a bug, but is in fact a feature of the underlying tools telling you that something is not right with the model that has been generated.

@tmbo's response and mine amount to this: in the past, whenever we have seen this warning, it was a strong indicator that the training data was insufficient.

If you'd like to share your training data with us either here, in a gist, or privately via e-mail we'd be happy to take a look. You may also have better luck using the mitie pipeline which qualitatively requires less training data, but tends to be slower to train.

As for Rasa itself, the GoLastMile team has managed to get 1500+ stars, 400+ unique cloners, and 14000+ viewers. I think they're doing wonderful work.

ghost commented 7 years ago

@wrathagom If you are "not a member of the company building Rasa", why are you answering a question that belongs to Rasa?

Instead of a thousand words, you could have done the testing in this time, but apparently that is your business model.

wrathagom commented 7 years ago

Locked, sorry we weren't able to help you.