Closed: LyndonCKZ closed this issue 4 years ago.
Yeah, I also noticed that it seemingly aggregates (sums) the feature vectors of the neighboring nodes with the same weight (1). This weighting scheme in the aggregation operator matches neither GCN nor GAT, which really confuses me.
Hi, we added back the hyperbolic attention mechanism (att_0) with an option (--use-att) to turn it on or off (this might make HGCN a bit slower due to dense matrix multiplications). Aggregation without attention does not use weights of 1, since the adjacency matrix is normalized (see data_utils.py).
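For reference, the standard GCN-style symmetric normalization produces exactly this effect. Here is a minimal sketch, assuming data_utils.py does something along these lines; the function name `normalize_adj` and the details here are my illustration, not necessarily the repository's code:

```python
import numpy as np
import scipy.sparse as sp

def normalize_adj(adj):
    """Symmetrically normalize a sparse adjacency: D^{-1/2} (A + I) D^{-1/2}."""
    adj = adj + sp.eye(adj.shape[0])              # add self-loops
    deg = np.asarray(adj.sum(axis=1)).flatten()   # node degrees
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5              # guard against isolated nodes
    d_mat = sp.diags(d_inv_sqrt)
    return d_mat @ adj @ d_mat                    # edge weights are degree-scaled, not 1
```

After this normalization, each neighbor's contribution is scaled by 1/sqrt(d_i * d_j), so the plain sum over neighbors is no longer a uniform-weight aggregation.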
Thanks. It seems the attention implementation can only be used on small datasets so far. It reweights the adjacency matrix with a sigmoid function, which differs from the softmax-style attention described in the paper. Moreover, as the paper suggests, aggregation is better conducted in the tangent space around each center node, whereas the released code performs it entirely in the tangent space of the origin.
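To make the distinction concrete, here is a small PyTorch sketch of the two weighting schemes; the tensors and names are toy assumptions, not the repository's code:

```python
import torch

n = 5
scores = torch.randn(n, n)                    # raw pairwise attention logits (illustrative)
adj = (torch.rand(n, n) > 0.5).float()        # toy adjacency
adj = (adj + torch.eye(n)).clamp(max=1.0)     # self-loops so no row is empty

# Sigmoid reweighting (as observed in the released code): each edge gets an
# independent weight in (0, 1); a node's neighbor weights need not sum to 1.
w_sigmoid = torch.sigmoid(scores) * adj

# Softmax attention (as described in the paper): a node's neighbor weights
# compete with one another and sum to 1 across each row.
masked = scores.masked_fill(adj == 0, float('-inf'))
w_softmax = torch.softmax(masked, dim=-1)
```

With sigmoid weights, a high-degree node can receive much more total mass than a low-degree one, which is exactly the behavior the softmax normalization avoids.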
See #18
Thanks for releasing the detailed code! However, I did not manage to find the attention mechanism mentioned in the paper; I apologize if I missed something. Meanwhile, it is hard to reproduce the results with the Hyperboloid model, as all the listed configurations are for the Poincaré model only. I would really appreciate it if the detailed Hyperboloid configurations could be released, since that model is the main one discussed in the paper. Regards,