truthless11 / GDPL

Task-oriented Dialog Policy Learning with Adversarial Inverse Reinforcement Learning

experimental setup #3

Closed. Jeiyoon closed this issue 4 years ago.

Jeiyoon commented 4 years ago

Hi, I read your GDPL paper (https://arxiv.org/abs/1908.10719) and ran your code (https://github.com/truthless11/GDPL), but even after many attempts I can't reproduce the results reported in the paper.

I followed all the instructions in the README and in the closed issues, and set the parameters in the code accordingly.

During pretraining I got a success rate above 60% (62%).

I don't know what the problem is.

Here are the parameters used during training:

Namespace(anneal=5000, batchsz=32, batchsz_traj=1024, clip=0.03, config='multiwoz', data_dir='data', epoch=16, epsilon=0.2, gamma=0.99, load='best0619/best', load_user='model_agenda/best', log_dir='log', lr_irl=0.0001, lr_rl=0.0001, lr_simu=0.001, pretrain=False, print_per_batch=400, process=16, save_dir='model_agenda', save_per_epoch=1, simulator='agenda', tau=0.95, test=False, update_round=5)

With these settings I only get a success rate of 20% to 50%, no matter how many times I try.

Would you mind sharing more details about training, such as the parameters and procedures for pretraining, training, and testing?

Here is my training procedure; all parameters are the same as above.

Load the pretrained model and save the trained model:
'--load', type=str, default='best0619/best' (pretrained model directory)
'--save_dir', type=str, default='model_agenda'

Load the trained model for testing:
'--load', type=str, default='model_agenda/best' (trained model directory)
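The two-step procedure above (train from the pretrained checkpoint, then test from the trained one) can be sketched as command-line arguments rather than edited defaults. This is only a minimal illustration: the flag names `--load`, `--save_dir`, `--pretrain`, and `--test` come from the Namespace dump in this thread, but treating `--pretrain` and `--test` as `store_true` switches, and the helpers `build_parser` and `describe_phase`, are assumptions made for the example and may not match the repository's actual parser.

```python
import argparse


def build_parser():
    # Hypothetical parser: only these four flag names are taken from the
    # Namespace dump in this thread; all other options are omitted.
    parser = argparse.ArgumentParser(description="sketch of train/test phases")
    parser.add_argument('--load', type=str, default='',
                        help='checkpoint prefix to load')
    parser.add_argument('--save_dir', type=str, default='model_agenda',
                        help='directory for saved checkpoints')
    # Assumption: boolean switches; the real script may define them differently.
    parser.add_argument('--pretrain', action='store_true')
    parser.add_argument('--test', action='store_true')
    return parser


def describe_phase(args):
    # Map the flag combination to the phase it selects.
    if args.pretrain:
        return 'pretrain'
    if args.test:
        return 'test'
    return 'train'


# Training: start from the pretrained checkpoint, save into model_agenda.
train_args = build_parser().parse_args(
    ['--load', 'best0619/best', '--save_dir', 'model_agenda'])

# Testing: load the checkpoint written by the training run.
test_args = build_parser().parse_args(
    ['--load', 'model_agenda/best', '--test'])

for name, args in (('train', train_args), ('test', test_args)):
    print(name, describe_phase(args), args.load, args.save_dir)
```

Passing the checkpoint path explicitly for each phase makes it harder to run a phase with a default left over from the previous one.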

Thank you so much.

Jeiyoon commented 4 years ago

1) pretrain result

INFO:root:reward 1.1038802235795007
INFO:root:turn 12.3
INFO:root:match 0.6692913385826772
INFO:root:inform rec 0.7343437231913068, F1 0.8145915939730374
INFO:root:success 0.619

2) test result

INFO:root:reward -1.0247540575394356
INFO:root:turn 16.217
INFO:root:match 0.49291338582677163
INFO:root:inform rec 0.6182442093222762, F1 0.7316412859560067
INFO:root:success 0.465
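To make the gap between the two runs easier to read, the log lines above can be parsed and compared metric by metric. The following is a rough sketch, assuming the log format is exactly as pasted in this thread; `parse_metrics` and the key names `inform_rec` and `inform_f1` are hypothetical helpers invented for the example, not part of the GDPL code.

```python
import re

# Log values copied from the two results posted above.
PRETRAIN_LOG = """INFO:root:reward 1.1038802235795007
INFO:root:turn 12.3
INFO:root:match 0.6692913385826772
INFO:root:inform rec 0.7343437231913068, F1 0.8145915939730374
INFO:root:success 0.619"""

TEST_LOG = """INFO:root:reward -1.0247540575394356
INFO:root:turn 16.217
INFO:root:match 0.49291338582677163
INFO:root:inform rec 0.6182442093222762, F1 0.7316412859560067
INFO:root:success 0.465"""

INFORM_RE = re.compile(r'inform rec ([-\d.]+), F1 ([-\d.]+)')


def parse_metrics(log_text):
    """Turn 'INFO:root:<name> <value>' lines into a dict of floats."""
    metrics = {}
    for line in log_text.splitlines():
        body = line.replace('INFO:root:', '', 1).strip()
        if not body:
            continue
        inform = INFORM_RE.match(body)
        if inform:
            # The inform line carries two values: recall and F1.
            metrics['inform_rec'] = float(inform.group(1))
            metrics['inform_f1'] = float(inform.group(2))
        else:
            name, value = body.split()
            metrics[name] = float(value)
    return metrics


pretrain = parse_metrics(PRETRAIN_LOG)
test = parse_metrics(TEST_LOG)

# Print each metric with the change from pretraining to test.
for key in pretrain:
    delta = test[key] - pretrain[key]
    print(f"{key:>10}: {pretrain[key]:8.3f} -> {test[key]:8.3f} ({delta:+.3f})")
```

Every metric moves in the worse direction here (lower reward, match, inform, and success; more turns), which is consistent with the test run not using the intended checkpoint or parameters.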

Jeiyoon commented 4 years ago

Would you mind releasing your pretrained models? I think the results depend heavily on the pretrained model. I've built many pretrained models with success rates above 60%, but none of them worked.

Jeiyoon commented 4 years ago

Oh, I'm so sorry. I had set the wrong parameters for each training procedure.

Problem solved! Thank you.