In the feedforward function in module.py,
line 297:
outputs = tf.contrib.layers.fully_connected(**params)
line 301:
params = {"inputs": inputs, "num_outputs": num_units[1], "activation_fn": None}
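I think the "inputs" key of the params dict at line 301 should be set to outputs from line 297 rather than to inputs. That would match the formula FFN(x) = max(0, xW1 + b1)W2 + b2 on page 5 of the paper "Attention Is All You Need", where the second linear layer is applied to the result of the first. As written, the second layer reads the original inputs and the result of the first layer is discarded.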
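To make the point concrete, here is a minimal plain-Python sketch of the formula for a single position. It is not the code from module.py and does not use the TensorFlow API: dense(), relu(), and the toy weights are hypothetical names and values made up for illustration. The only thing it is meant to show is that the readout layer takes the inner layer's result as its input.

```python
# Plain-Python sketch of FFN(x) = max(0, x W1 + b1) W2 + b2 for one position.
# dense() is a hypothetical stand-in for a fully connected layer; it is not
# the TensorFlow API.

def dense(x, weights, bias, activation=None):
    """Return activation(x @ weights + bias) for a single input vector x."""
    out = [
        sum(x_i * w_row[j] for x_i, w_row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]
    return [activation(v) for v in out] if activation else out


def relu(v):
    return max(0.0, v)


def feedforward(x, w1, b1, w2, b2):
    # Inner layer: max(0, x W1 + b1)
    hidden = dense(x, w1, b1, activation=relu)
    # Readout layer: consumes `hidden`, not the original `x`
    return dense(hidden, w2, b2)


# Toy weights (assumed values, for illustration only): 2 -> 3 -> 2
x = [1.0, -2.0]
w1 = [[1.0, 0.0, -1.0],
      [0.0, 1.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, 1.0]]
b2 = [0.5, -0.5]

y = feedforward(x, w1, b1, w2, b2)
print(y)
```

If the readout layer were given x instead of hidden, the ReLU layer would have no effect on the result, which is what I believe happens at line 301.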