pbonazzi closed this issue 2 years ago.
Hi, I also don't understand why it's a Hadamard product of K and Q here. Have you figured it out?
Hi @pbonazzi, @GaichaoLee,
After the element-wise multiplication in the code snippet you quoted above, there is a sum applied across the feature dimension (d = hidden_dim / num_heads) to get the final scalars. Effectively, it is a dot product.
The element-wise multiplication helps maintain a d-dimensional edge feature, which is used in the GraphTransformer layer with edge features.
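For concreteness, here is a minimal PyTorch sketch of that equivalence (the tensor names and shapes are illustrative assumptions, not the repo's exact variables): the Hadamard product followed by a sum over the feature dimension gives the same scalars as a per-edge, per-head dot product.

```python
import torch

# Hypothetical shapes for illustration only:
num_edges, num_heads, d = 5, 8, 16  # d = hidden_dim / num_heads

K = torch.randn(num_edges, num_heads, d)
Q = torch.randn(num_edges, num_heads, d)

# Element-wise (Hadamard) product keeps a d-dimensional feature per edge,
# which the layer with edge features can reuse downstream.
scores = K * Q                      # shape: [num_edges, num_heads, d]

# Summing over the feature dimension yields the attention logits,
# i.e. a dot product K_e . Q_e for each edge e and head h.
attn_logits = scores.sum(dim=-1)    # shape: [num_edges, num_heads]

# Sanity check: identical to an explicit per-edge dot product.
check = torch.einsum('ehd,ehd->eh', K, Q)
assert torch.allclose(attn_logits, check)
```

Note that because attention is computed per edge, the result is [num_edges, num_heads] rather than a dense [num_nodes, num_nodes] (or [num_edges, num_edges]) matrix.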
Please refer to the detailed explanation in this issue https://github.com/graphdeeplearning/graphtransformer/issues/4
Thanks for your reply! I debugged the code again and now understand how you get the dot product.
Hi! Congratulations on your paper, and thank you for making the implementation publicly available.
Quick question about this function:
Why do you do an element-wise multiplication of K and Q rather than a dot product? The scores have shape [num_edges, num_heads, hidden_dim/num_heads], but I expected a [num_edges, num_edges] matrix.
You can also reach me at pietrobonazzi.edu@gmail.com. Hope to hear from you soon, Pietro Bonazzi