Closed SuvodipDey closed 3 years ago
You can refer to https://github.com/microsoft/DialoGPT/tree/master/dstc and https://github.com/mgalley/DSTC7-End-to-End-Conversation-Modeling/tree/master/evaluation/src for more evaluation details. The DialoGPT issue https://github.com/microsoft/DialoGPT/issues/48 is also helpful.
Thanks for sharing the evaluation method. One more clarification regarding the evaluation on the DailyDialog dataset. As per the paper, it seems that you used the multi-reference DailyDialog dataset to report the results. Please let me know if this is the correct link to the dataset. Also, did you use all five reference responses to compute the metrics?
Yes, it's the correct link. We use all five reference responses.
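For anyone reproducing the numbers, here is a minimal sketch of how several reference responses can enter an n-gram overlap metric, plus a Distinct-n diversity score. It is not the DialoGPT evaluation code linked above; the function names (`ngrams`, `clipped_precision`, `distinct_n`) and the toy sentences are hypothetical, and it assumes whitespace-tokenized text. It only illustrates BLEU-style clipping, where each hypothesis n-gram is credited up to the maximum count seen in any single reference.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, references, n):
    """BLEU-style modified n-gram precision against multiple references.

    Each hypothesis n-gram count is clipped to the largest count of that
    n-gram found in any one reference (hypothetical helper, for illustration).
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    if not hyp_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    matched = sum(min(count, max_ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / sum(hyp_counts.values())


def distinct_n(hypotheses, n):
    """Ratio of unique n-grams to total n-grams over all generated responses."""
    all_grams = [g for h in hypotheses for g in ngrams(h, n)]
    return len(set(all_grams)) / len(all_grams) if all_grams else 0.0


# Toy data (made up): one hypothesis scored against one or two references.
hyp = "the cat sat on the mat".split()
refs = ["the cat is on the mat".split(), "a cat sat on a mat".split()]

one_ref = clipped_precision(hyp, refs[:1], 2)   # only the first reference
two_refs = clipped_precision(hyp, refs, 2)      # both references
```

In this toy case the bigram precision rises when the second reference is added, which is why the number of references used has to match when comparing against reported scores.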
Thanks a lot.
First of all, kudos for this nice work. I really liked it. I am trying to reproduce the results of your paper. It would be very helpful if you could share the evaluation script for the automated metrics. The paper says, "We employ the evaluation scripts used by DialoGPT." Could you please point out the DialoGPT file used for your evaluation?