lc222 / MPCNN-sentence-similarity-tensorflow

Implementation of the paper "Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks"

How should the code with attention added be used? #7

Open yuye2133 opened 6 years ago

yuye2133 commented 6 years ago

My code does the following, but the final loss is NaN. Have you run into this problem?

```python
x1_embedded_chars = tf.nn.embedding_lookup(word2vec, self.input_x1)
x2_embedded_chars = tf.nn.embedding_lookup(word2vec, self.input_x2)
self.x1_embedded_expand = tf.expand_dims(x1_embedded_chars, -1)
self.x2_embedded_expand = tf.expand_dims(x2_embedded_chars, -1)

self.attention_x1, self.attention_x2 = self.attention()
```

The `attention` here is your function. After that, the two sentences go through convolution and pooling as usual, except the filter shape becomes `[filter_size, 2*embedding_size, 1, num_filters]`; the rest of the code is unchanged. But during training the loss becomes NaN and it errors out with:

```
InvalidArgumentError (see above for traceback): Nan in summary histogram for: output/b_0/grad/hist
```

Have you run into this problem, and how did you solve it?
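For context, the `2*embedding_size` filter width comes from concatenating each sentence's embedding with an attention-weighted view of the other sentence. The repository's `attention()` is not shown in this thread, so the following is only a minimal NumPy sketch of that concatenation (the cosine-similarity attention and `softmax` helper are assumptions, not the repo's code):

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax: subtract the row max before exponentiating
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

seq_len, embedding_size = 5, 4
x1 = np.random.randn(seq_len, embedding_size)
x2 = np.random.randn(seq_len, embedding_size)

# hypothetical attention: cosine similarity between every token pair
n1 = x1 / np.linalg.norm(x1, axis=1, keepdims=True)
n2 = x2 / np.linalg.norm(x2, axis=1, keepdims=True)
weights = softmax(n1 @ n2.T, axis=1)          # (seq_len, seq_len)

# each x1 token gets an attended summary of x2, then the two are concatenated
x1_attended = weights @ x2                     # (seq_len, embedding_size)
attention_x1 = np.concatenate([x1, x1_attended], axis=1)

print(attention_x1.shape)                      # (seq_len, 2*embedding_size)
```

This is why the convolution filter's second dimension doubles: the channel it slides over is now the original embedding plus the attended embedding.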
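On the NaN itself: a gradient histogram going NaN usually means the loss already diverged, and a common culprit in this kind of model is `log(0)` inside a cross-entropy loss when a softmax output saturates. A minimal NumPy sketch (not this repository's training code) showing the failure and the usual epsilon-clipping fix:

```python
import numpy as np

# a saturated softmax output: the true class got probability exactly 0
probs = np.array([1.0, 0.0])
true_class = 1

with np.errstate(divide="ignore"):
    loss_raw = -np.log(probs[true_class])      # log(0) -> -inf, loss -> inf,
                                               # and gradients become NaN

eps = 1e-10                                    # hypothetical clipping constant
loss_safe = -np.log(np.clip(probs[true_class], eps, 1.0))  # finite loss

print(loss_raw, loss_safe)
```

In TensorFlow terms, the analogous fixes would be clipping probabilities before `tf.log`, using the fused `tf.nn.softmax_cross_entropy_with_logits` (which is numerically stable), lowering the learning rate, or applying `tf.clip_by_global_norm` to the gradients; which one applies depends on where the divergence starts in your graph.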