Open TimeLessLing opened 3 years ago
Hi, thank you for your interesting research and the high-quality code. I have successfully run HGNN on the node_classification task.
But when I read the code, I found that args.hyp_vars is always an empty list during training. I also printed this variable and it is indeed empty, which means the hyperbolic_optimizer does not correspond to any parameters.
May I ask if this is normal?
Hi,
Thanks for your interest in our project.
The parameters (e.g. the embedding matrix) can live either in hyperbolic space or in Euclidean space. If a parameter is in Euclidean space, we first apply the exp map to move it into hyperbolic space before applying any hyperbolic operations, and it is optimized with vanilla SGD. If it is in hyperbolic space, it is optimized with hyperbolic SGD.
For node classification, we find that using all Euclidean parameters performs better than using hyperbolic parameters. That's why you observe that the list of hyperbolic parameters is empty. Let me know if there is any other problem. Thanks!
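For illustration only, here is a minimal sketch of how such a split could be wired up, with geoopt used as a stand-in for a hyperbolic (Riemannian) SGD; the toy model and parameter names are hypothetical and this is not the actual repository code:

```python
import torch
import geoopt  # third-party Riemannian optimization library, used here only for illustration


class ToyModel(torch.nn.Module):
    """Hypothetical stand-in for an HGNN-style network."""
    def __init__(self, num_nodes=10, dim=4):
        super().__init__()
        # Euclidean parameter: lives in R^dim and is mapped into hyperbolic space
        # with the exp map inside the model before any hyperbolic operation.
        self.eucl_embedding = torch.nn.Parameter(torch.randn(num_nodes, dim) * 1e-3)
        # Hyperbolic parameter: constrained to the Lorentz manifold (points in R^(dim+1)).
        origin = torch.zeros(num_nodes, dim + 1)
        origin[:, 0] = 1.0  # the hyperboloid origin (1, 0, ..., 0) for curvature -1
        self.hyp_embedding = geoopt.ManifoldParameter(origin, manifold=geoopt.Lorentz())


model = ToyModel()
hyp_vars = [p for p in model.parameters() if isinstance(p, geoopt.ManifoldParameter)]
eucl_vars = [p for p in model.parameters() if not isinstance(p, geoopt.ManifoldParameter)]

# Euclidean parameters are updated with vanilla SGD.
optimizer = torch.optim.SGD(eucl_vars, lr=0.01)

# Hyperbolic parameters are updated with a Riemannian ("hyperbolic") SGD so that they
# stay on the manifold. For node classification this list would be empty, which is why
# args.hyp_vars is empty and no hyperbolic optimizer is needed.
if hyp_vars:
    hyp_optimizer = geoopt.optim.RiemannianSGD(hyp_vars, lr=0.01)
```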
Best Regards, Qi
Thanks for your answer. But I still have a question about the forward process of RiemannianGNN.
In the first layer of the GNN, the node representations are transformed into Lorentz space by exp_map_zero, but in the next layer the features in Lorentz space are transformed back to Euclidean space by log_map_zero, and the aggregation is conducted in Euclidean space. The only operation in Lorentz space is applying the activation to node_repr; all the rest of the computation is done in Euclidean space. So what role does hyperbolic space play here? How is it reflected?
Hi,
You are right. In order to apply message passing, we have to map the representation back to Euclidean space with the log map. As shown in Section 3.2, we apply the activation after the exp map to prevent the model from collapsing into a vanilla Euclidean GCN.
What you described is just the first layer, where exp_map_zero and log_map_zero cancel each other. However, at the upper layers the operations are not canceled, because of the activation function. The hidden representations at the upper layers are already in hyperbolic space.
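To make this pattern concrete, here is a rough, self-contained sketch of the flow being described, with simplified Lorentz-model maps written out for clarity; it is illustrative only, not the repository's RiemannianGNN implementation:

```python
import torch


def exp_map_zero(v, eps=1e-8):
    """Map a tangent vector v in R^d at the origin onto the Lorentz hyperboloid in R^(d+1)."""
    n = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.cat([torch.cosh(n), torch.sinh(n) * v / n], dim=-1)


def log_map_zero(x, eps=1e-8):
    """Map a point x on the hyperboloid back to the tangent (Euclidean) space R^d at the origin."""
    x0, xs = x[..., :1], x[..., 1:]
    n = xs.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.acosh(x0.clamp_min(1.0)) * xs / n


def hyp_relu(x):
    """Toy activation in Lorentz space: ReLU on the spatial coordinates, then re-project onto the hyperboloid."""
    xs = torch.relu(x[..., 1:])
    return torch.cat([torch.sqrt(1.0 + (xs ** 2).sum(-1, keepdim=True)), xs], dim=-1)


def forward(features, adj, layers):
    """features: (N, d) Euclidean inputs; adj: (N, N) aggregation matrix; layers: list of nn.Linear(d, d)."""
    node_repr = exp_map_zero(features)         # first layer: map Euclidean features into Lorentz space
    for linear in layers:
        h = log_map_zero(node_repr)            # back to the tangent space (undoes the exp map only at layer 1)
        h = adj @ linear(h)                    # linear transform + neighbourhood aggregation, in Euclidean space
        node_repr = hyp_relu(exp_map_zero(h))  # back onto the hyperboloid; the activation prevents the next
                                               # layer's log map from simply cancelling this exp map
    return node_repr
```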
Besides this, our final centroid aggregation for prediction uses hyperbolic distances.
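A correspondingly simplified sketch of a centroid read-out based on hyperbolic (Lorentz) distances; the centroid construction and layer sizes are made up for illustration:

```python
import torch


def to_hyperboloid(spatial):
    """Lift spatial coordinates in R^d onto the hyperboloid by solving for the time coordinate."""
    x0 = torch.sqrt(1.0 + (spatial ** 2).sum(dim=-1, keepdim=True))
    return torch.cat([x0, spatial], dim=-1)


def lorentz_distance(x, c, eps=1e-6):
    """Pairwise hyperbolic distance d(x, c) = arccosh(-<x, c>_L) for points x (N, d+1) and centroids c (C, d+1)."""
    prod = x.unsqueeze(1) * c.unsqueeze(0)              # (N, C, d+1)
    inner = -prod[..., 0] + prod[..., 1:].sum(dim=-1)   # Lorentzian inner product, (N, C)
    return torch.acosh((-inner).clamp_min(1.0 + eps))


num_nodes, dim, num_centroids, num_classes = 100, 8, 16, 7
node_repr = to_hyperboloid(torch.randn(num_nodes, dim))                          # final hyperbolic node representations
centroid_spatial = torch.nn.Parameter(torch.randn(num_centroids, dim) * 1e-2)    # learnable (Euclidean) centroid coordinates
centroids = to_hyperboloid(centroid_spatial)                                     # centroids placed on the hyperboloid

# Distances to the centroids form the feature vector fed to an ordinary classifier.
logits = torch.nn.Linear(num_centroids, num_classes)(lorentz_distance(node_repr, centroids))
```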
Best Regards, Qi
Hi, I think you are right: the activation function in the upper layers makes sure the operations are not canceled. But what I am more confused about is that, although the exp and log operations are no longer exact inverses of each other at that point, the two operations are still valid (they do implement their respective space transformations). So is there any special meaning in applying the activation function in hyperbolic space?
As far as I know, the biggest advantage of hyperbolic space is that its negative curvature can avoid some shortcomings of representing high-dimensional data in Euclidean space. For a specific model such as the Poincaré ball, it also has the characteristic that the volume of a ball grows exponentially with its radius. So how does applying the activation function in hyperbolic space make use of these characteristics of hyperbolic space?
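(For context, a tiny numerical illustration of how hyperbolic distance behaves in the Poincaré ball with curvature -1; this snippet is unrelated to the repository code:)

```python
import math


def poincare_dist_from_origin(r):
    """Hyperbolic distance (curvature -1) from the origin to a point at Euclidean radius r in the Poincare ball."""
    return math.log((1 + r) / (1 - r))  # equals 2 * artanh(r)


for r in (0.5, 0.9, 0.99, 0.999, 0.9999):
    print(f"Euclidean radius {r:6.4f} -> hyperbolic distance {poincare_dist_from_origin(r):5.2f}")
# Each extra digit of Euclidean radius near the boundary adds roughly the same hyperbolic
# distance (~log 10 ≈ 2.3), i.e. the space available grows exponentially with the radius.
```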
I'm sorry to ask you so many questions; I really do not understand the physical meaning of this part of the transformation.
Hi,
The activation just needs to ensure that the activated representation is still in hyperbolic space. Therefore, you can use ReLU etc. here, as long as it does not break this requirement. There have also been some activations tailored to hyperbolic space proposed recently; those can be used as well. Thanks!
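For concreteness, one simple way to apply a ReLU-style activation while keeping the result on the Lorentz hyperboloid (a sketch of the general idea, not the repository's implementation) is to act on the spatial coordinates and recompute the time coordinate:

```python
import torch


def hyperboloid_relu(x):
    """ReLU on the spatial coordinates, with the time coordinate recomputed so that the
    output still satisfies the Lorentz constraint -x0^2 + sum_i xi^2 = -1."""
    xs = torch.relu(x[..., 1:])
    x0 = torch.sqrt(1.0 + (xs ** 2).sum(dim=-1, keepdim=True))
    return torch.cat([x0, xs], dim=-1)


# Sanity check: the constraint still holds after the activation.
spatial = torch.randn(5, 4)
x = torch.cat([torch.sqrt(1.0 + (spatial ** 2).sum(-1, keepdim=True)), spatial], dim=-1)
y = hyperboloid_relu(x)
print(-y[:, 0] ** 2 + (y[:, 1:] ** 2).sum(dim=-1))  # ~ -1 for every row
```

Another option along the same lines is to apply the nonlinearity in the tangent space (log map, activation, exp map), or to use one of the recently proposed hyperbolic activations mentioned above.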