An Abstractive Summarization Implementation with Transformer and Pointer-generator
Hi, I am struggling to understand this line: `attn_dists_projected = tf.map_fn(fn=lambda y: tf.scatter_nd(y[0], y[1], [dec_t, self.hp.vocab_size]), elems=(x, attn_dists), dtype=tf.float32)` #13
Closed
yyht closed 5 years ago
It just maps each index of the attention distribution onto the corresponding vocabulary index, so you get the attention weight of each word piece over the vocabulary.
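To expand on that: for every decoder step, `tf.scatter_nd` takes the attention distribution over source positions and scatters each weight into a vocab-sized vector, using the source token's vocabulary id as the target index; weights of repeated source tokens are summed. Below is a minimal, self-contained sketch of that projection (TF 2.x). The names `src_ids` and `indices`, and the concrete sizes (batch=1, dec_t=2, attn_len=3, vocab_size=5), are made up for illustration and stand in for `x` and the hyperparameters in the repo:

```python
import tensorflow as tf

# Made-up sizes and inputs for illustration only.
vocab_size = 5
src_ids = tf.constant([[3, 1, 3]])            # [batch, attn_len] source token ids
attn_dists = tf.constant([[[0.5, 0.3, 0.2],   # [batch, dec_t, attn_len]
                           [0.1, 0.7, 0.2]]])

batch_size = tf.shape(src_ids)[0]
dec_t = tf.shape(attn_dists)[1]
attn_len = tf.shape(attn_dists)[2]

# For every (decoder step, source position), build the index pair
# (decoder step, vocab id of that source token).
dec = tf.tile(tf.range(dec_t)[None, :, None], [batch_size, 1, attn_len])  # [batch, dec_t, attn_len]
ids = tf.tile(src_ids[:, None, :], [1, dec_t, 1])                         # [batch, dec_t, attn_len]
indices = tf.stack([dec, ids], axis=3)                                    # [batch, dec_t, attn_len, 2]

# Scatter each attention weight into the vocab axis, one example at a time;
# duplicate source tokens have their weights summed by scatter_nd.
attn_dists_projected = tf.map_fn(
    fn=lambda y: tf.scatter_nd(y[0], y[1], [dec_t, vocab_size]),
    elems=(indices, attn_dists), dtype=tf.float32)                        # [batch, dec_t, vocab_size]

print(attn_dists_projected)
# [[[0.  0.3 0.  0.7 0. ]
#   [0.  0.7 0.  0.3 0. ]]]
```

Note how vocab id 3, which appears twice in the source, ends up with the sum of both of its attention weights (0.5 + 0.2 = 0.7 at decoder step 0). This projected distribution can then be mixed with the generator's vocabulary distribution, which is the pointer-generator copy mechanism.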