MiuLab / SlotGated-SLU

Slot-Gated Modeling for Joint Slot Filling and Intent Prediction

Clarification on reproducing paper result #9

Closed Elkaito closed 4 years ago

Elkaito commented 4 years ago

I am trying to reproduce the experiment for my thesis and I am having a hard time getting the same numbers.

In your paper, you report: "In all experiments ... the numbers are averaged over 20 runs". But based on the output file:

...
2020-03-09 19:52:52,220 : INFO : Epochs: 19
2020-03-09 19:52:52,220 : INFO : Loss: 0.02750968653033072
2020-03-09 19:52:53,346 : INFO : Valid:
2020-03-09 19:52:54,032 : INFO : slot f1: 97.74259747874524
2020-03-09 19:52:54,033 : INFO : intent accuracy: 96.6
2020-03-09 19:52:54,033 : INFO : semantic error(intent, slots are all correct): 89.60000000000001
2020-03-09 19:52:54,033 : INFO : Test:
2020-03-09 19:52:54,948 : INFO : slot f1: 95.40960451977402
2020-03-09 19:52:54,948 : INFO : intent accuracy: 95.63269876819709
2020-03-09 19:52:54,948 : INFO : semantic error(intent, slots are all correct): 84.5464725643897
2020-03-09 19:53:11,712 : INFO : Step: 5600
2020-03-09 19:53:11,714 : INFO : Epochs: 20
2020-03-09 19:53:11,714 : INFO : Loss: 0.027824909109663818
2020-03-09 19:53:12,100 : INFO : Valid:
2020-03-09 19:53:12,613 : INFO : slot f1: 97.48538011695906
2020-03-09 19:53:12,613 : INFO : intent accuracy: 97.39999999999999
2020-03-09 19:53:12,613 : INFO : semantic error(intent, slots are all correct): 89.8
2020-03-09 19:53:12,613 : INFO : Test:
2020-03-09 19:53:13,270 : INFO : slot f1: 95.26501766784452
2020-03-09 19:53:13,270 : INFO : intent accuracy: 95.40873460246361
2020-03-09 19:53:13,270 : INFO : semantic error(intent, slots are all correct): 83.87458006718926

I am confused about which set of numbers is representative of a single run. Since an early-stopping strategy is applied, do I understand correctly that the representative result for a run is the last output (marked in bold), and that these last-output numbers are then averaged over the 20 runs?
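To make the question concrete, here is a minimal sketch of the reading I am assuming: take the Test metrics of the last logged epoch in each run and average them across runs. It assumes one log file per run in the format shown above. The names `parse_epochs`, `last_epoch_test`, `average_runs`, and the short metric keys are hypothetical and not part of the repository. Whether the last epoch is the intended choice, rather than for example the epoch with the best validation score, is exactly what I am unsure about.

```python
from statistics import mean

# Hypothetical short names for the three metrics printed in the log.
METRIC_PREFIXES = {
    "slot f1": "slot_f1",
    "intent accuracy": "intent_acc",
    "semantic error": "semantic",
}


def parse_epochs(log_text):
    """Return one dict per logged epoch: {"epoch": int, "valid": {...}, "test": {...}}."""
    epochs, current, split = [], None, None
    for line in log_text.splitlines():
        if " : INFO : " not in line:
            continue
        message = line.split(" : INFO : ", 1)[1]
        key, _, value = message.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Epochs":
            # A new epoch block starts; metrics that follow belong to it.
            current = {"epoch": int(value), "valid": {}, "test": {}}
            epochs.append(current)
            split = None
        elif current is None:
            continue
        elif key in ("Valid", "Test"):
            split = key.lower()
        elif split is not None:
            for prefix, name in METRIC_PREFIXES.items():
                if key.startswith(prefix):
                    current[split][name] = float(value)
    return epochs


def last_epoch_test(log_text):
    """Test metrics of the last logged epoch (the assumed 'representative' result)."""
    return parse_epochs(log_text)[-1]["test"]


def average_runs(per_run_metrics):
    """Average each metric over a list of per-run metric dicts."""
    keys = per_run_metrics[0].keys()
    return {k: mean(run[k] for run in per_run_metrics) for k in keys}


# Abridged excerpt of the log above, used only to exercise the parser.
SAMPLE_LOG = """\
2020-03-09 19:52:52,220 : INFO : Epochs: 19
2020-03-09 19:52:54,033 : INFO : Test:
2020-03-09 19:52:54,948 : INFO : slot f1: 95.40960451977402
2020-03-09 19:53:11,714 : INFO : Epochs: 20
2020-03-09 19:53:12,613 : INFO : Test:
2020-03-09 19:53:13,270 : INFO : slot f1: 95.26501766784452
"""

print(last_epoch_test(SAMPLE_LOG))
```

With 20 log files, `average_runs([last_epoch_test(text) for text in logs])` would give the averaged numbers under this reading. Swapping `last_epoch_test` for a function that picks the epoch with the best validation score would give the alternative reading.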

I would appreciate it if somebody could kindly clarify.

Thanks!