JinYujie99 / WBM


The equation (10) in the paper may be miswritten. #2

Closed CM-BF closed 1 year ago

CM-BF commented 1 year ago

Hi Yujie,

Congratulations on the acceptance of your new work! After reading the paper, I have two questions I hope to discuss.

Thank you for your time. I'm looking forward to your reply!

Best, Shurui

JinYujie99 commented 1 year ago

Hi Shurui,

Thanks for your interest in our work and for raising the issue.

  1. We are sorry that we did indeed miswrite equation (10); it should be $\mathcal{O}\big((\log N_S)^{1/2} \cdot N_S^{-1/(2(D_{\chi^j}+1))}\big)$, as you say. We will correct it in the arXiv version. Thank you for pointing this out!
  2. An intuitive explanation: First, learning the barycenter takes graph-level consensus into account (while keeping the learning convergence rate with respect to the number of nodes controllable), thus enhancing generalization. Second, the Wasserstein metric is stronger than the Euclidean metric because it implicitly accounts for the optimal matching between nodes.
  3. About the node weights: We believe that learning node weights weakens the effect of incorrect size settings of the barycenters, and that this is one of the key elements of our method. The EBM in the paper already considers learnable node weights to some extent, in the sense that the vectorized class-wise Euclidean barycenters are learned end-to-end. We also conducted additional experiments on learning the node weights and node features (rather than directly learning the pooled graph embedding as the barycenter) in EBM. The details can be found in our rebuttal to Reviewer yn5M at https://openreview.net/forum?id=Z1I4WrV5TG.
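To make the second point concrete, here is a minimal numpy/scipy sketch (not from the paper's code; all names are illustrative). The Euclidean distance between stacked node-feature matrices depends on the arbitrary node ordering, while the discrete 2-Wasserstein distance between two uniformly weighted node sets of equal size reduces to an optimal assignment and is therefore invariant to node permutations:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))       # node features of graph A
P = np.array([1, 2, 3, 4, 0])     # a nontrivial node reordering
Y = X[P]                          # graph B: same node set, shuffled order

# Squared Euclidean distance between the stacked feature matrices
# depends on the node ordering, so it is nonzero here:
euclid = np.sum((X - Y) ** 2)

# The squared 2-Wasserstein distance with uniform node weights is an
# optimal assignment over pairwise squared costs, hence order-invariant:
C = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
rows, cols = linear_sum_assignment(C)
w2_sq = C[rows, cols].mean()      # the matching recovers the permutation
```

Here `euclid` is strictly positive while `w2_sq` is zero, because the optimal assignment finds the hidden node correspondence that the fixed-order Euclidean comparison misses.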
CM-BF commented 1 year ago

Hi Yujie,

Thanks a lot for the explanations!

It is a pity that I cannot access the link https://openreview.net/forum?id=Z1I4WrV5TG, since the reviews are not publicly visible. I would appreciate it if you could share the content. :) I am also not sure I understand "The EBM in the paper has already considered the learnable node weights to some extent, in the sense that the vectorized class-wise euclidean barycenters are learned end-to-end." If applicable, any examples would be appreciated.

Best, Shurui

JinYujie99 commented 1 year ago

Hi Shurui,

Sorry, I didn't notice that the link is not public, and I don't know how to share it so far.

We consider three variants of EBM:

  1. EBM (in the paper): The pooled graph embedding of each Euclidean barycenter is modeled as a learnable parameter. For each barycenter there is only a single vector to learn, and it can be seen as a weighted sum of node features. That is why I said "The EBM in the paper has already considered the learnable node weights to some extent". :)
  2. EBM+: Both the node weights and the node features of the Euclidean barycenters are modeled as learnable parameters.
  3. EBM-1/N: The node weights are fixed at 1/N, and the node features of the Euclidean barycenters are modeled as learnable parameters.

In our experiments, EBM (in the paper) and EBM+ performed similarly, and both outperformed EBM-1/N. We only included the first variant in the final paper.
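For illustration, the three parameterizations can be sketched in plain numpy as follows. This is a hypothetical sketch, not the paper's code: the arrays below stand in for what would be trainable parameters (e.g. `nn.Parameter`s) in an end-to-end setup, and all names and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
C, N, D = 3, 4, 8   # classes, barycenter nodes, feature dim (illustrative)

# EBM (paper): the pooled graph embedding itself is the learnable object,
# one vector per class -- implicitly already a weighted sum of node features.
ebm_embed = rng.normal(size=(C, D))

# EBM+: both per-node features and node weights are learnable;
# weights come from softmax-normalized logits so they sum to 1.
feats = rng.normal(size=(C, N, D))
logits = np.zeros((C, N))
weights_plus = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# EBM-1/N: per-node features are learnable, weights fixed at 1/N.
weights_unif = np.full((C, N), 1.0 / N)

def pool(w, f):
    """Weighted node pooling: one barycenter embedding per class."""
    return (w[..., None] * f).sum(axis=1)
```

With zero-initialized logits, `pool(weights_plus, feats)` coincides with `pool(weights_unif, feats)`; the variants only diverge once the weights (EBM+) or the pooled vector itself (EBM) are trained.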

Best, Yujie

CM-BF commented 1 year ago

Thanks a lot for your explanations!