TonyNemo / UBAR-MultiWOZ

AAAI 2021: "UBAR: Towards Fully End-to-End Task-Oriented Dialog System with GPT-2"

How to reproduce the results on multiwoz2.0 reported in your paper using the provided checkpoint? #4

Open lizekang opened 3 years ago

lizekang commented 3 years ago

I downloaded the provided checkpoint and tested it. But I couldn't reproduce the results on multiwoz2.0 reported in your paper.

| inform | success | bleu | score |
|--------|---------|-------|--------|
| 96.3 | 91.1 | 22.26 | 115.96 |

TonyNemo commented 3 years ago

You can reproduce the results by reading the README file and following its steps.

lizekang commented 3 years ago

I followed the steps in the README file, but with your checkpoint I still get only the results above.

Red-Liu199 commented 3 years ago

Has this problem been solved? I evaluated using the response generation setting and got the same results as above: validation [CTR] match: 96.30 success: 91.10 bleu: 22.26 score: 115.96
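For reference, the `score` value in this log line is consistent with the combined score commonly used for MultiWOZ evaluation, 0.5 * (inform + success) + BLEU, with `match` playing the role of inform. The sketch below only checks that arithmetic on the numbers reported in this thread. The function name `combined_score` is hypothetical and is not taken from the UBAR code.

```python
def combined_score(inform: float, success: float, bleu: float) -> float:
    """Combined score, assuming the common MultiWOZ convention:
    0.5 * (inform + success) + BLEU.
    The name `combined_score` is hypothetical, chosen for this example."""
    return 0.5 * (inform + success) + bleu


# Numbers reported in this thread (match/inform, success, BLEU).
score = combined_score(96.30, 91.10, 22.26)

# Round to two decimals, as the evaluation log does.
print(f"{score:.2f}")
```

Under that assumption, the reported 115.96 follows directly from the other three numbers, so the discrepancy with the paper would lie in inform, success, or BLEU rather than in how the combined score is computed.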

TonyNemo commented 3 years ago

I have double-checked on my end.

311dada commented 3 years ago

I downloaded the provided checkpoint and tested it. But I couldn't reproduce the results on multiwoz2.0 reported in your paper.

| inform | success | bleu | score |
|--------|---------|-------|--------|
| 96.3 | 91.1 | 22.26 | 115.96 |

Same result!

SkyAndCloud commented 3 years ago

@TonyNemo