shuvayan closed 7 years ago
There should be a statistic printed at the beginning when starting the training. Can you post these results here? (They contain how many intents / entities there are, plus samples of each.)
This is the full transcript of the messages thrown:
C:\Users\shuvayan.das\Downloads\Chatbot_Python\entityRecognition\RASA_NLU>python -m rasa_nlu.train -c config.json
INFO:root:Trying to load spacy model with name 'en'
INFO:root:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.
INFO:root:Training data format at ./data/trainData.json is rasa_nlu
INFO:root:Training data stats:
INFO:root:Starting to train component nlp_spacy
INFO:root:Finished training component.
INFO:root:Starting to train component ner_spacy
INFO:root:Finished training component.
INFO:root:Starting to train component ner_synonyms
INFO:root:Finished training component.
INFO:root:Starting to train component intent_featurizer_spacy
INFO:root:Finished training component.
INFO:root:Starting to train component intent_featurizer_ngrams
C:\Users\shuvayan.das\AppData\Local\Continuum\Anaconda3.3\lib\site-packages\rasa_nlu\featurizers\ngram_featurizer.py:175: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
  sentences = np.array(sentences)[mask]
C:\Users\shuvayan.das\AppData\Local\Continuum\Anaconda3.3\lib\site-packages\rasa_nlu\featurizers\ngram_featurizer.py:176: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
  labels = np.array(labels)[mask]
Traceback (most recent call last):
  File "C:\Users\shuvayan.das\AppData\Local\Continuum\Anaconda3.3\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\shuvayan.das\AppData\Local\Continuum\Anaconda3.3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\shuvayan.das\AppData\Local\Continuum\Anaconda3.3\lib\site-packages\rasa_nlu\train.py", line 83, in
It seems the error is in the sklearn liblinear model part, and I have found the same bug reported elsewhere: https://github.com/lensacom/sparkit-learn/issues/49
If this is the case, I believe random shuffling of the records has to be implemented before feeding the data to the models.
Right, so this is somewhat tricky. During training, the data gets shuffled and split into multiple cross-validation folds. It seems like the splitting creates partial data sets that contain only one of the classes. So you should add more examples, but we should also find a way around this (reduce the number of splits?).
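To illustrate the failure mode described above, here is a minimal sketch in plain Python. It is not rasa_nlu's or sklearn's actual splitting code; the helper names `stratified_folds` and `single_class_training_splits` and the example counts are made up for illustration. It deals each class's examples round-robin into folds and then checks whether any training split (all folds except the held-out one) ends up with a single class.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits, seed=0):
    """Deal each class's shuffled sample indices round-robin into n_splits folds."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)
    return folds


def single_class_training_splits(labels, folds):
    """Return the fold numbers whose training portion holds fewer than 2 classes."""
    bad = []
    for k, held_out in enumerate(folds):
        held = set(held_out)
        train_classes = {label for i, label in enumerate(labels) if i not in held}
        if len(train_classes) < 2:
            bad.append(k)
    return bad


# Hypothetical label lists: one with a single "explore" example, one with three.
too_few = ["buy"] * 5 + ["explore"] * 1
enough = ["buy"] * 5 + ["explore"] * 3

bad_too_few = single_class_training_splits(too_few, stratified_folds(too_few, 3))
bad_enough = single_class_training_splits(enough, stratified_folds(enough, 3))
print(bad_too_few, bad_enough)
```

With only one "explore" example, holding out the fold that contains it leaves a training split made up entirely of "buy", which is the situation a solver that requires two classes would reject. Once every class has at least two examples and they are spread across folds, no training split in this sketch loses a class.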
Yes, adding more records will help, but I saw somewhere that randomizing the records might help too. I will try with more records and let you know the results.
Hello,
I am using the code below to train my model:
and this is the config file I am using:
However it throws an error saying:
Clearly this is not the case, as my training data has more than 2 classes (of intent: buy & explore). A sample of the training data:
Can you please help with this?