Closed se4u closed 6 years ago
Hello @se4u
Maybe you should increase your hidden size, because 12 is too small to generalize well (the model underfits). Increasing the dropout rate a bit should also help. Please give it a try.
@jasonwu0731 Thank you for your reply. I reran the experiment after setting hdd to 50, and training seems to work much better than before on the Calendar and Weather domains, but not on Navigation.
08-10 13:15 Epoch:0
L:6.00, VL:4.30, PL:1.70: 100%|███████████████████████████| 3145/3145 [01:32<00:00, 34.04it/s]
08-10 13:16 STARTING EVALUATION
R:0.0925,W:74.8909: 100%|███████████████████████████████████| 389/389 [00:27<00:00, 13.91it/s]
08-10 13:17 F1 SCORE: 0.0008123476848090981
08-10 13:17 F1 CAL: 0.0
08-10 13:17 F1 WET: 0.002145922746781116
08-10 13:17 F1 NAV: 0.0
08-10 13:17 BLEU SCORE:2.57
08-10 13:17 MODEL SAVED
08-10 13:17 Epoch:1
L:4.97, VL:3.49, PL:1.48: 100%|███████████████████████████| 3145/3145 [01:29<00:00, 35.24it/s]
08-10 13:18 STARTING EVALUATION
R:0.0964,W:68.3815: 100%|███████████████████████████████████| 389/389 [00:29<00:00, 13.41it/s]
08-10 13:19 F1 SCORE: 0.06580016246953696
08-10 13:19 F1 CAL: 0.10122699386503067
08-10 13:19 F1 WET: 0.09871244635193133
08-10 13:19 F1 NAV: 0.004555808656036446
08-10 13:19 BLEU SCORE:4.76
08-10 13:19 MODEL SAVED
08-10 13:19 Epoch:2
L:4.63, VL:3.24, PL:1.40: 100%|███████████████████████████| 3145/3145 [01:28<00:00, 35.51it/s]
08-10 13:20 STARTING EVALUATION
R:0.0977,W:68.5270: 100%|███████████████████████████████████| 389/389 [00:29<00:00, 13.01it/s]
08-10 13:20 F1 SCORE: 0.19740048740861085
08-10 13:20 F1 CAL: 0.3128834355828221
08-10 13:20 F1 WET: 0.30257510729613735
08-10 13:20 F1 NAV: 0.0
08-10 13:20 BLEU SCORE:7.6
08-10 13:20 MODEL SAVED
08-10 13:20 Epoch:3
The command that I used with hdd=12 came directly from the README.
❱❱❱ python3 main_train.py -lr=0.001 -layer=1 -hdd=12 -dr=0.0 -dec=Mem2Seq -bsz=2 -ds=kvr -t=
@andreamad8 Can you please update the README with the parameters used to obtain the results in the paper?
Good to hear that. Let me modify the readme and close the issue.
@jasonwu0731 Thanks for updating the README. Are these the same hyperparameters that were used to generate the results in the paper?
@jasonwu0731 Ah got it, thank you. I was looking at your ACL paper that is linked in the README. http://aclweb.org/anthology/P18-1136
I have one last question regarding the train.txt file. How was it generated from kvret_train_public.json? Which file contains this preprocessing code? And can you briefly summarize the format of train.txt? I want to use your model on a different dataset.
@se4u
Ah, we didn't upload the preprocessing code that we used to parse the original .json file. Please check the .json file, since it includes a lot of other information that we didn't use in our paper, such as requests and slots.
train.txt contains the domain (lines starting with #), KB information (lines starting with 0), and dialog turns (lines starting with 1, 2, 3, ...).
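For anyone adapting another dataset, the line markers described above could be read with something like the following. This is only a rough sketch, not the authors' preprocessing code (which was not uploaded). It assumes that a line starting with # opens a new sample, that the leading marker is separated from the rest of the line by a space, and that blank lines can be ignored. The function name `parse_dialogs` and the toy lines are made up for illustration; the real column layout inside each line should be checked against the actual train.txt.

```python
from typing import Dict, List


def parse_dialogs(lines: List[str]) -> List[Dict[str, object]]:
    """Group lines into samples using their leading marker.

    Assumptions (not confirmed in the thread): '#' opens a new sample,
    and the marker is followed by a single space.
    """
    samples: List[Dict[str, object]] = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if line.startswith("#"):
            # Domain line: start a new sample.
            current = {"domain": line.strip("#").strip(), "kb": [], "turns": []}
            samples.append(current)
            continue
        if current is None:
            # Tolerate input that has no leading domain line.
            current = {"domain": None, "kb": [], "turns": []}
            samples.append(current)
        marker, _, rest = line.partition(" ")
        if marker == "0":
            current["kb"].append(rest)  # KB information
        elif marker.isdigit():
            current["turns"].append((int(marker), rest))  # dialog turn
    return samples


# Hypothetical toy input, not taken from the real train.txt.
toy = [
    "#weather#",
    "0 some kb entry",
    "1 first dialog turn",
    "2 second dialog turn",
    "",
    "#navigate#",
    "1 another dialog turn",
]
for sample in parse_dialogs(toy):
    print(sample["domain"], len(sample["kb"]), len(sample["turns"]))
```

Going the other way (writing a new dataset into this layout) would mean emitting one # line per dialog, then the 0 lines, then the numbered turns.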
Hi,
I am trying to reproduce your experiments, and I am just running the first command in the README. My PyTorch version is 0.3, as you can see below. I am evaluating after every epoch instead of just the first epoch. As you can see at the bottom of the log, the model accuracy is close to 0 even after 8 epochs, and the BLEU score is ~4.5. Is this expected behavior?