[Open] crazygirlfym opened this issue 5 years ago
I'd like to ask: during training, have you ever run into abnormal values from tf.exp causing the AUC to jump? DIN's attention has a softmax, but when using tf.exp alone the values can blow up. Is there a good way to avoid this?

Reply: So far we have not observed abnormal values from tf.exp during training. The DIN paper does not say the attention uses softmax; softmax only appears at the prediction output, Eq. (2), see Figure 2. Quoting the paper (p. 1063): "However different from traditional attention method, the constraint of \sum_i w_i = 1 is relaxed in Eq.(3), aiming to reserve the intensity of user interests. That is, normalization with softmax on the output of a(.) is abandoned." Of course, you could also try using softmax in the attention of DSTN.
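For reference, here is a minimal sketch of two common ways to keep exp-based attention weights numerically stable, assuming a TensorFlow model where `logits` are the raw outputs of the attention network a(.). The function names and the clip threshold are illustrative choices, not from either paper; tf.exp overflows float32 once its input exceeds roughly 88.

```python
import tensorflow as tf

def exp_attention_weights(logits, clip=30.0):
    # DIN-style unnormalized weights: keep the raw intensity of each
    # interest instead of normalizing to sum to 1. Clipping the logits
    # bounds tf.exp at exp(clip) and prevents inf/NaN from large
    # activations; the threshold 30.0 is an illustrative choice.
    return tf.exp(tf.clip_by_value(logits, -clip, clip))

def softmax_attention_weights(logits):
    # Normalized alternative: tf.nn.softmax subtracts the per-row max
    # internally, so it cannot overflow regardless of the logit scale.
    return tf.nn.softmax(logits, axis=-1)
```

Clipping keeps the intensity property the DIN quote argues for, since typical weights are unchanged and only extreme activations are bounded, whereas softmax reinstates the \sum_i w_i = 1 constraint that Eq. (3) deliberately relaxes.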