I found the evaluation script from PLATO.
I want to know whether you used the same script,
and where I can find the evaluation script for AVG that is used for DailyDialog and PersonaChat.
Also, how did you get the AVG results for other models, for example PLATO? Did you reproduce their experiments?