JoeZhouWenxuan / Self-regulation-Employing-a-Generative-Adversarial-Network-to-Improve-Event-Detection

event detection

Inconsistencies with the paper #2

Open DorianKodelja opened 6 years ago

DorianKodelja commented 6 years ago

In section 4.5 of the paper, Ldiff is combined with the softmax loss using lambda to optimize Θd. As I understand it, it should be applied to Θg, not Θd, since Ldiff is only a function of og and oghat. In the code there are two lambdas:

```python
self.train_op_g = optimizer_g.minimize(self.g_loss + 0.1 * self.diff_loss, var_list=vars_g)
self.total_loss = self.loss + self.l2_loss + self.diff_loss * 0.00001
```

and the diff loss is applied to g, not d. Is the code the proper version?
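To make the discrepancy concrete, here is a minimal sketch in plain Python of how the repository composes the two objectives, with made-up scalar values standing in for the TensorFlow loss tensors (the numbers are purely illustrative; only the weights 0.1 and 0.00001 come from the code):

```python
# Hypothetical stand-in scalar values for the losses computed by the model.
g_loss = 0.8      # generator loss
diff_loss = 0.5   # Ldiff between og and oghat
loss = 1.2        # softmax (cross-entropy) loss
l2_loss = 0.05    # L2 regularization

# Objective minimized w.r.t. the generator parameters (vars_g),
# with lambda = 0.1 on the diff loss:
g_objective = g_loss + 0.1 * diff_loss

# Objective on the detector side, with lambda = 0.00001 on the diff loss
# (the paper reports 0.001 instead):
total_loss = loss + l2_loss + diff_loss * 0.00001

print(g_objective)  # 0.85
print(total_loss)   # 1.250005
```

Note that diff_loss enters both objectives, with different weights, which is the inconsistency being pointed out.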

DorianKodelja commented 6 years ago

In fact, I think I understand the code more clearly now, but it is still not consistent: the value reported in the article is 0.1^3 (0.001), but the value used in the loss is 0.00001 (0.1^5). The article also doesn't mention applying Ldiff to the GAN part (train_op_g). Furthermore, since defining the loss as the sum of cross_entropy and cross_entropy_binary is not mentioned in the paper, is it significantly helping the performance?
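For reference, the summed loss in question can be sketched like this with NumPy, using hypothetical toy probabilities for a single token; reading cross_entropy as the multiclass loss over event types and cross_entropy_binary as a binary trigger/non-trigger loss is my interpretation of the code, not something stated in the paper:

```python
import numpy as np

# Hypothetical predicted distributions for one token (illustrative only).
p_types = np.array([0.7, 0.2, 0.1])  # softmax over event types, gold = class 0
p_trigger = 0.9                      # predicted P(token is a trigger), gold = 1

cross_entropy = -np.log(p_types[0])        # multiclass event-type term
cross_entropy_binary = -np.log(p_trigger)  # binary trigger term

# The repository sums the two terms, which the paper does not mention:
combined = cross_entropy + cross_entropy_binary
```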

superlyc commented 5 years ago

> In fact, I think I understand the code more clearly now, but it is still not consistent: the value reported in the article is 0.1^3 (0.001), but the value used in the loss is 0.00001 (0.1^5). The article also doesn't mention applying Ldiff to the GAN part (train_op_g). Furthermore, since defining the loss as the sum of cross_entropy and cross_entropy_binary is not mentioned in the paper, is it significantly helping the performance?

Did you get the reported F score? How did you process the raw ACE 2005 data? In the code, it seems that sentences shorter than 8 words are filtered out. What about testing? If that filter applies there too, does it mean the test did not use all the data?
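To illustrate the concern, a length filter of the kind described would look roughly like this (a hypothetical helper, not the repository's actual code); if it also ran at test time, evaluation would cover only the surviving sentences:

```python
MIN_LEN = 8  # threshold observed in the code

def filter_short(sentences, min_len=MIN_LEN):
    """Drop sentences with fewer than min_len whitespace-separated tokens."""
    return [s for s in sentences if len(s.split()) >= min_len]

# Toy test split (made-up examples).
test_sentences = [
    "He was arrested yesterday .",                           # 5 tokens -> dropped
    "The troops attacked the village early this morning .",  # 9 tokens -> kept
]
kept = filter_short(test_sentences)
# If this filter runs during testing, recall is measured only on `kept`,
# not on the full test set.
```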

txy960427 commented 5 years ago

> In fact, I think I understand the code more clearly now, but it is still not consistent: the value reported in the article is 0.1^3 (0.001), but the value used in the loss is 0.00001 (0.1^5). The article also doesn't mention applying Ldiff to the GAN part (train_op_g). Furthermore, since defining the loss as the sum of cross_entropy and cross_entropy_binary is not mentioned in the paper, is it significantly helping the performance?

> Did you get the reported F score? How did you process the raw ACE 2005 data? In the code, it seems that sentences shorter than 8 words are filtered out. What about testing? If that filter applies there too, does it mean the test did not use all the data?

It uses all the test data. During testing, it doesn't do padding, as its parameter is False.