Closed sherlock1987 closed 4 years ago
Hi, at the moment the GDPL model shows a slight improvement over the pretrained MLE model in the early epochs, but the performance drops later. We will solve this problem as soon as possible.
Thanks Bro
Is there any clue? We could fix this problem together. I believe the reward estimator has some problems, since the loss function is based on that estimator.
Hey, has anyone started looking at this?
Yes, I am working on it.
Cool!
move to #54
Describe the bug When I train the GDPL model with the MLE pretrained model loaded, the loss and the evaluation result stay around 0.26. The log output is below; could you help me out? GDPL looks pretty good, and I plan to use it as my baseline model.
To Reproduce
WARNING:root:illegal booking slot: time, slot: hotel domain
WARNING:root:illegal booking slot: time, slot: hotel domain
WARNING:root:illegal booking slot: time, slot: hotel domain
WARNING:root:illegal booking slot: time, slot: hotel domain
WARNING:root:illegal booking slot: time, slot: taxi domain
DEBUG:root:<> epoch 0, loss_real:-0.5383382267836068, loss_gen:-1.5583195904683735
INFO:root:<> epoch 0: saved network to mdl
DEBUG:root:<> weight -3.7587242126464844
DEBUG:root:<> log pi -11.807324409484863
/home/raliegh/视频/convlab2_github_code_theirs/ConvLab-2/convlab2/policy/gdpl/gdpl.py:183: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
torch.nn.utils.clip_grad_norm(self.policy.parameters(), 10)
DEBUG:root:<
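By the way, the `clip_grad_norm` deprecation warning in the log should be harmless: PyTorch renamed the in-place variant to `torch.nn.utils.clip_grad_norm_`, so changing that one call at `gdpl.py:183` silences it. To show what that call actually does, here is a pure-Python sketch of global-norm gradient clipping (an illustrative helper only, not the PyTorch or ConvLab-2 implementation):

```python
import math

def clip_grad_norm_(grads, max_norm):
    """Illustrative version of global-norm clipping: scale the gradient
    list in place so its L2 norm is at most max_norm, and return the
    pre-clip norm (mirroring what torch.nn.utils.clip_grad_norm_ returns)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for i, g in enumerate(grads):
            grads[i] = g * scale
    return total_norm

grads = [6.0, 8.0]               # global L2 norm is 10
pre_clip = clip_grad_norm_(grads, 5.0)
# pre_clip == 10.0; grads is now [3.0, 4.0], whose norm is 5.0
```

In GDPL the call clips the policy network's gradients to a max norm of 10 before the optimizer step, which keeps a noisy reward-estimator signal from producing huge updates.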
Thank you guys, have a good day! Appreciate your help.