jiangdongguo / ChitChatAssistant

(No longer maintained) Rasa Chinese chatbot
http://blog.csdn.net/andrexpert
534 stars 149 forks

Training a model with MITIE reports no errors, so why is no file generated? #19

Open shengyaokai opened 4 years ago

shengyaokai commented 4 years ago

I ran this command:

python -m rasa train --config configs/config.yml --domain configs/domain.yml --data data/

Terminal output:

Processed trackers: 100%|██████████| 64/64 [00:00<00:00, 219.91it/s, # actions=128]
2020-06-10 09:53:33 INFO rasa.core.agent - Persisted model to 'C:\Users\ADMINI~1\AppData\Local\Temp\tmpslhpch1h\core'
Core model training completed.
Training NLU model...
2020-06-10 09:53:38 INFO rasa.nlu.components - Added 'MitieNLP' to component cache. Key 'MitieNLP-D:\ChitChatAssistant-master\data\total_word_feature_extractor.dat'.
2020-06-10 09:53:38 INFO rasa.nlu.tokenizers.jieba_tokenizer - Loading Jieba User Dictionary at data/dict\userdict.txt
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\ADMINI~1\AppData\Local\Temp\jieba.cache
Loading model cost 0.770 seconds.
Prefix dict has been built successfully.
2020-06-10 09:53:39 INFO rasa.nlu.training_data.training_data - Training data stats:
2020-06-10 09:53:39 INFO rasa.nlu.training_data.training_data - Number of intent examples: 309 (13 distinct intents)
2020-06-10 09:53:39 INFO rasa.nlu.training_data.training_data - Found intents: 'inform', 'greet', 'stop', 'goodbye', 'inform_business', 'chitchat', 'whattodo', 'affirm', 'thanks', 'request_weather', 'request_number', 'deny', 'whoareyou'
2020-06-10 09:53:39 INFO rasa.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2020-06-10 09:53:39 INFO rasa.nlu.training_data.training_data - Number of entity examples: 155 (5 distinct entities)
2020-06-10 09:53:39 INFO rasa.nlu.training_data.training_data - Found entity types: 'business', 'type', 'date_time', 'number', 'address'
2020-06-10 09:53:39 INFO rasa.nlu.model - Starting to train component MitieNLP
2020-06-10 09:53:39 INFO rasa.nlu.model - Finished training component.
2020-06-10 09:53:39 INFO rasa.nlu.model - Starting to train component JiebaTokenizer
2020-06-10 09:53:39 INFO rasa.nlu.model - Finished training component.
2020-06-10 09:53:39 INFO rasa.nlu.model - Starting to train component MitieEntityExtractor
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa\utils\common.py:351: UserWarning: Failed to use example '查询电话19862618425' to train MITIE entity extractor. Example will be skipped.Error: Invalid entity {'start': 2, 'end': 4, 'value': '电话', 'entity': 'type'} in example '查询电话19862618425': entities must span whole tokens. Wrong entity start.
C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\rasa\utils\common.py:351: UserWarning: Failed to use example '我想了解开房记录' to train MITIE entity extractor. Example will be skipped.Error: Invalid entity {'start': 4, 'end': 8, 'value': '开房记录', 'entity': 'business'} in example '我想了解开房记录': entities must span whole tokens. Wrong entity start.
Training to recognize 5 labels: 'type', 'number', 'business', 'date_time', 'address'
Part I: train segmenter
words in dictionary: 200000
num features: 271
now do training
C: 20
epsilon: 0.01
num threads: 1
cache size: 5
max iterations: 2000
loss per missed segment: 3
C: 20 loss: 3 0.931217
C: 35 loss: 3 0.931217
C: 20 loss: 4.5 0.94709
C: 5 loss: 3 0.936508
C: 20 loss: 1.5 0.925926
C: 18.1395 loss: 5.98842 0.94709
C: 25.2331 loss: 5.03726 0.94709
C: 17.8806 loss: 4.50912 0.94709
C: 18.709 loss: 4.85208 0.94709
C: 20 loss: 4.35 0.94709
C: 21.7862 loss: 4.44297 0.94709
C: 20.5424 loss: 4.41599 0.941799
best C: 20
best loss: 4.5
num feats in chunker model: 4095
train: precision, recall, f1-score: 0.989529 1 0.994737
Part I: elapsed time: 38 seconds.

Part II: train segment classifier
now do training
num training samples: 191
C: 200 f-score: 0.98081
C: 400 f-score: 0.98081

This should mean the model was generated successfully, right? But no file was generated in the models folder.
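The two UserWarning lines in the log are unrelated to the missing model file, but they do mean those two training examples were skipped: MITIE requires every entity annotation to start and end on a token boundary. A minimal sketch of that check follows. The tokenization passed in is a hypothetical stand-in for what JiebaTokenizer produced (the log does not show the actual tokens), and the function names are illustrative, not part of Rasa.

```python
from typing import Dict, List, Tuple


def token_offsets(text: str, tokens: List[str]) -> List[Tuple[int, int]]:
    """Return the (start, end) character offsets of each token, in order."""
    offsets = []
    cursor = 0
    for token in tokens:
        start = text.index(token, cursor)  # find the token at or after the cursor
        end = start + len(token)
        offsets.append((start, end))
        cursor = end
    return offsets


def misaligned_entities(
    text: str, tokens: List[str], entities: List[Dict]
) -> List[Tuple[str, str]]:
    """List entities whose start or end does not fall on a token boundary."""
    offsets = token_offsets(text, tokens)
    starts = {start for start, _ in offsets}
    ends = {end for _, end in offsets}
    problems = []
    for entity in entities:
        if entity["start"] not in starts:
            problems.append((entity["value"], "start"))
        elif entity["end"] not in ends:
            problems.append((entity["value"], "end"))
    return problems


text = "查询电话19862618425"
entity = {"start": 2, "end": 4, "value": "电话", "entity": "type"}

# Hypothetical tokenization (assumption): the tokenizer kept "查询电话" as one token.
print(misaligned_entities(text, ["查询电话", "19862618425"], [entity]))

# Hypothetical tokenization (assumption): "电话" is its own token, so it aligns.
print(misaligned_entities(text, ["查询", "电话", "19862618425"], [entity]))
```

If a check like this flags an example, the usual options are to adjust the annotation or to add the entity value to the Jieba user dictionary so that it is split out as its own token.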

jiangdongguo commented 4 years ago

Try specifying the model parameter.
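The reply does not name a flag. One possible reading, assumed here, is the --out option of rasa train, which sets the directory where the trained model archive is written (default: models). The sketch below only builds the argument list; the helper name is hypothetical, and the paths are the ones from the original command.

```python
import sys
from typing import List


def build_train_command(
    config: str, domain: str, data: str, out: str = "models"
) -> List[str]:
    """Build a `rasa train` invocation with an explicit output directory.

    Assumption: "the model parameter" in the reply refers to --out.
    """
    return [
        sys.executable, "-m", "rasa", "train",
        "--config", config,
        "--domain", domain,
        "--data", data,
        "--out", out,  # directory that receives the packaged model
    ]


command = build_train_command("configs/config.yml", "configs/domain.yml", "data/")
print(" ".join(command[1:]))

# To actually start training (this can take a long time with MITIE):
# import subprocess
# subprocess.run(command, check=True)
```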

shengyaokai commented 4 years ago

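@jiangdongguo I got the project up and running. Training just took a very long time, so I assumed the model had not been generated. It took a little under an hour to train.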
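As the log shows, the Core model is first persisted to a temporary directory, so the models folder stays empty until NLU training (including the slow MITIE step) has finished. A small sketch for checking whether an archive has appeared yet, assuming Rasa packages the trained model as a .tar.gz file in the output directory; the function name is illustrative.

```python
from pathlib import Path
from typing import Optional


def latest_model(models_dir: str = "models") -> Optional[Path]:
    """Return the most recently modified .tar.gz in models_dir, or None."""
    directory = Path(models_dir)
    if not directory.is_dir():
        return None
    archives = sorted(directory.glob("*.tar.gz"), key=lambda p: p.stat().st_mtime)
    return archives[-1] if archives else None


model = latest_model("models")
if model is None:
    print("No model archive yet; training may still be running.")
else:
    print(f"Latest model: {model}")
```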