Open DorianKodelja opened 6 years ago
In fact, I think I understand the code more clearly now, but it is still not consistent: the value reported in the article is 0.1^3 (0.001), but the value used in the loss is 0.00001 (0.1^5). The article doesn't mention applying Ldiff to the gan part (train_op_g). Furthermore, since defining the loss as the sum of cross_entropy and cross_entropy_binary is not mentioned in the paper, is it significantly helping the performance?
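For readers unfamiliar with the combined loss being discussed, here is a minimal sketch of summing a multi-class softmax cross-entropy with a binary cross-entropy. The logits, the subtype layout, and the trigger/non-trigger reading of cross_entropy_binary are all assumptions for illustration, not taken from the repository:

```python
import math

def softmax(zs):
    # Numerically stable softmax over a list of logits.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical per-token logits over event subtypes, index 0 = "no event".
logits = [2.0, 0.5, -1.0, 0.3]
gold = 1  # gold subtype index (the token is a trigger)

probs = softmax(logits)
cross_entropy = -math.log(probs[gold])

# Binary reading (an assumption): trigger vs. non-trigger,
# where "trigger" pools all non-zero subtypes.
p_trigger = sum(probs[1:])
cross_entropy_binary = -math.log(p_trigger)  # gold label is "trigger"

# The combined objective simply sums the two terms.
combined = cross_entropy + cross_entropy_binary
```

Whether this extra binary term actually moves the F score is exactly the question above; the sketch only shows what summing the two terms computes.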
Did you get the reported F score? How did you process the raw ACE2005 data? In the code, it seems sentences with fewer than 8 words are filtered out. How about testing? If so, does it mean the test was not using all the data?
> Did you get the reported F score? How did you process the raw ACE2005 data? In the code, it seems sentences with fewer than 8 words are filtered out. How about testing? If so, does it mean the test was not using all the data?

All the test data is used. During testing, it doesn't do padding, as its parameter is False.
In section 4.5 of the paper, Ldiff is combined with the softmax loss using lambda to optimize Θd. As I understand it, it should be applied to Θg, not Θd, since Ldiff is only a function of og and oghat. In the code there are two lambdas:
```python
self.train_op_g = optimizer_g.minimize(self.g_loss + 0.1 * self.diff_loss, var_list=vars_g)
self.total_loss = self.loss + self.l2_loss + self.diff_loss * 0.00001
```
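To make the weighting inconsistency concrete, here is a minimal sketch with hypothetical scalar loss values (all numbers invented for illustration) showing how the two lambdas in the code weight diff_loss differently, and neither matches the paper's 0.1^3:

```python
# Hypothetical scalar losses, for illustration only.
g_loss = 1.2     # adversarial (generator) loss
loss = 0.9       # softmax cross-entropy loss
l2_loss = 0.05   # L2 regularization term
diff_loss = 2.0  # Ldiff

# Generator objective uses lambda = 0.1, as in train_op_g.
g_objective = g_loss + 0.1 * diff_loss          # 1.2 + 0.2 = 1.4

# Total objective uses lambda = 0.00001 (0.1^5),
# not the 0.001 (0.1^3) reported in the article.
total_loss = loss + l2_loss + diff_loss * 0.00001  # 0.9 + 0.05 + 0.00002

print(g_objective)  # 1.4
print(total_loss)   # 0.95002
```

At 0.1^5 the Ldiff contribution to total_loss is essentially negligible, which is part of why the discrepancy with the paper matters.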
and the diff loss is applied to g, not d. Is the code the proper version?