smartyfh / MultiWOZ2.4

MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset
MIT License

Could you please provide the results of STAR on MultiWOZ 2.4? #3

Open couragelfyang opened 3 years ago

couragelfyang commented 3 years ago

Thanks for your very good work; it is helpful. However, when I reran STAR on MultiWOZ 2.4, my result was 72.39%, which is lower than the one reported in the paper. I think the difference may be caused by the hardware environment. Could you please upload your results (e.g., exp.txt) for a fair comparison?

smartyfh commented 3 years ago

Here you go.

Results based on best acc:

{'epoch': 1,
 'loss': 6.586433454652304,
 'joint_acc': 0.7361563517915309,
 'joint_turn_acc': 0.9044516829533116,
 'slot_acc': array([0.98561346, 0.97366992, 0.97896308, 0.98357763, 0.9987785 ,
        0.99796417, 0.99837134, 0.97882736, 0.97801303, 0.98425624,
        0.98927796, 0.99375679, 0.96335505, 0.9919924 , 0.99647123,
        0.99728556, 0.99552117, 0.99212812, 0.9655266 , 0.98914224,
        0.99294245, 0.98927796, 0.98968512, 0.99402823, 0.99307818,
        0.99484256, 0.99714984, 0.99389251, 0.99280673, 0.98384908]),
 'ave_slot_acc': 0.9884681505609845,
 'final_joint_acc': 0.6506506506506506,
 'final_slot_acc': array([0.98198198, 0.95495495, 0.97397397, 0.98098098, 0.998999  ,
        0.99499499, 0.998999  , 0.97097097, 0.96396396, 0.97897898,
        0.98698699, 0.99299299, 0.96296296, 0.98998999, 0.99499499,
        0.996997  , 0.99299299, 0.99199199, 0.94694695, 0.98798799,
        0.98498498, 0.98098098, 0.98098098, 0.98798799, 0.98898899,
        0.99199199, 0.99499499, 0.99399399, 0.99399399, 0.97397397]),
 'final_ave_slot_acc': 0.9838838838838838}
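For readers comparing their own runs against these numbers, here is a minimal sketch of how joint accuracy and per-slot accuracy are conventionally defined in dialogue state tracking. It is an illustration only, not the STAR evaluation code: the function name `dst_accuracies`, the `"none"` default for unfilled slots, and the toy slot names are all assumptions, and the keys in the dictionary above may be computed differently in the actual repository.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]  # slot name -> predicted or gold value


def dst_accuracies(
    preds: List[State], golds: List[State], slots: List[str]
) -> Tuple[float, List[float]]:
    """Return (joint accuracy, per-slot accuracies) over a list of turns.

    Hypothetical helper: a turn counts toward joint accuracy only if
    every slot matches; each slot is also scored on its own.
    """
    if len(preds) != len(golds) or not golds:
        raise ValueError("preds and golds must be non-empty and equal in length")

    joint_hits = 0
    slot_hits = [0] * len(slots)
    for pred, gold in zip(preds, golds):
        all_correct = True
        for i, slot in enumerate(slots):
            # Assumption: a missing slot is treated as the value "none".
            if pred.get(slot, "none") == gold.get(slot, "none"):
                slot_hits[i] += 1
            else:
                all_correct = False
        joint_hits += int(all_correct)

    n = len(golds)
    return joint_hits / n, [hits / n for hits in slot_hits]


# Toy data (made up for illustration): two turns, two slots.
slots = ["hotel-area", "train-day"]
golds = [
    {"hotel-area": "north", "train-day": "monday"},
    {"hotel-area": "north", "train-day": "friday"},
]
preds = [
    {"hotel-area": "north", "train-day": "monday"},
    {"hotel-area": "north", "train-day": "sunday"},  # one slot wrong
]
joint, per_slot = dst_accuracies(preds, golds, slots)
```

On the toy data a single wrong slot makes the whole turn count as incorrect for joint accuracy, which is why joint accuracy sits well below the per-slot values. For the numbers in this thread, the posted 'joint_acc' of 0.7361563517915309 corresponds to about 73.62%, roughly 1.2 percentage points above the 72.39% from the rerun.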

couragelfyang commented 3 years ago

thanks a lot