reppy4620 / Dialog

A PyTorch implementation of a Japanese chatbot using BERT and a Transformer decoder
MIT License
72 stars 29 forks

pretrained model's response is just only 'おはようございます' #14

Closed kpis-msa closed 4 years ago

kpis-msa commented 4 years ago

Dialog training has just finished. The training log is as follows:

INFO:root: Initializing
INFO:root:Preparing training data
INFO:root:Define Models
INFO:root:Define Loss and Optimizer
INFO:root:Start Training
Epoch: 1: 100%|██████████| 54882/54882 [9:39:27<00:00, 1.58it/s, Loss: 2.18444]
Epoch: 2: 100%|██████████| 54882/54882 [9:37:28<00:00, 1.58it/s, Loss: 2.54174]
Epoch: 3: 100%|██████████| 54882/54882 [9:31:07<00:00, 1.60it/s, Loss: 2.34356]
Saved Model
おはようございます
Saved Model
おはようございます。
Saved Model
おはようございます

Then I tested the trained 'ckpt.pth' file using run_eval.py, but the responses from the trained model are as follows:

$ python3 run_eval.py
You>おはよう
BOT>おはようございます
You>今日は疲れた
BOT>おはようございます
You>美味しいものを食べたい
BOT>おはようございます

Every response is just 'おはようございます'. What is wrong? Please let me know what you think.

reppy4620 commented 4 years ago

Maybe the training file has a lot of similar lines, e.g. (おはよう, おはよう!), and the chatbot model is optimized with CrossEntropyLoss, so, as mentioned in the README, this model still has the dull-response problem. In fact, I ran into the same issue: the model always output "お疲れさま" no matter what sentence I gave it. Trimming the data may improve model quality.
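
For illustration, here is a minimal sketch of one way the data could be trimmed. It assumes the training data can be loaded as a list of (utterance, response) string pairs, which may not match the repository's actual data format. The function name cap_frequent_responses, the max_per_response parameter, and the toy pairs are all hypothetical. The idea is to count how often each response appears and keep only a limited number of pairs per distinct response, so that a few very common greetings do not dominate the cross-entropy objective.

```python
from collections import Counter


def cap_frequent_responses(pairs, max_per_response=2):
    """Keep at most `max_per_response` pairs for each distinct response text.

    `pairs` is assumed to be a list of (utterance, response) strings.
    This is a hypothetical helper, not part of the Dialog repository.
    """
    seen = Counter()
    kept = []
    for utterance, response in pairs:
        key = response.strip()
        if seen[key] < max_per_response:
            kept.append((utterance, response))
        seen[key] += 1
    return kept


# Toy data, made up for illustration only.
pairs = [
    ("おはよう", "おはようございます"),
    ("おはよー", "おはようございます"),
    ("おはようです", "おはようございます"),
    ("今日は疲れた", "お疲れさま"),
    ("美味しいものを食べたい", "何が食べたいですか"),
]

# Inspect which responses dominate before trimming.
print(Counter(response for _, response in pairs).most_common(3))

trimmed = cap_frequent_responses(pairs, max_per_response=2)
print(len(pairs), "->", len(trimmed))
```

Checking the most common responses first shows whether a handful of replies really dominates the corpus. A suitable cap depends on the dataset size, and this kind of trimming may reduce dull responses without removing them entirely.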