ICT-GIMLab / SeHGNN


About simplified GNNs in SeHGNN #1

Closed skepsun closed 2 years ago

skepsun commented 2 years ago

GAMLP actually follows the paradigm of using labels and multi-stage training from SAGN(+SLE). Moreover, GAMLP for ogbn-mag also 'copies' the combination of NARS and SAGN(+SLE). Unfortunately, the authors of GAMLP do not want to clarify these points and only compare final results with SAGN(+SLE) in their paper.

SeHGNN surely makes some contributions to HGNNs, but I am surprised that the SeHGNN paper says nothing about SAGN(+SLE) or even SIGN. Please check the repositories and papers (and their update dates) of GAMLP and SAGN (and HSAGN), and confirm the above facts.

Yangxc13 commented 2 years ago

Thank you for your attention. The paper we put on arXiv is a pre-print manuscript rather than the final version. We will certainly listen to advice from the community and update the paper before the end of this year.

SeHGNN is a purely heterogeneous graph algorithm and involves no experiments on homogeneous graphs. Therefore, when selecting baselines from the ogbn-mag leaderboard, we only chose the top-1 method (GAMLP, as of April 2022) and other methods whose papers address heterogeneous graphs. It was our oversight not to pay attention to SAGN(+SLE). We will consider adding it to the next version of the SeHGNN manuscript.

In addition, the paradigm of using labels and multi-stage training can be traced back to earlier papers such as [1,2,3], which add confident nodes and their pseudo labels to the training set for the next training stage (see the sketch after the references below). In SeHGNN we use a similar formulation to these papers. GAMLP improves this paradigm by introducing RLU (reliable label utilization). We tested RLU but found no performance gain, so we did not use it. We are glad to evaluate whether the paradigm in SAGN(+SLE) can further enhance SeHGNN. If you are interested, we can keep in contact about it.

[1] Qimai Li, Zhichao Han, and Xiao-Ming Wu. 2018. Deeper insights into graph convolutional networks for semi-supervised learning. In Thirty-Second AAAI Conference on Artificial Intelligence.

[2] Ke Sun, Zhouchen Lin, and Zhanxing Zhu. 2020. Multi-stage self-supervised learning for graph convolutional networks on graphs with few labeled nodes. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 5892–5899.

[3] Han Yang, Xiao Yan, Xinyan Dai, Yongqiang Chen, and James Cheng. 2021. Self-enhanced GNN: Improving graph neural networks using model outputs. In 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 1–8.
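To make the paradigm concrete, here is a minimal sketch of the multi-stage self-training loop described above, assuming a generic node-classification setup. All names (`train_one_stage`, `features`, `train_idx`, etc.) are illustrative placeholders rather than code from SeHGNN, SAGN, or GAMLP.

```python
import torch
import torch.nn.functional as F

# Sketch of the multi-stage paradigm from [1,2,3]: after each stage, unlabeled
# nodes predicted with confidence above a threshold are added to the training
# set together with their pseudo labels, and the model is trained again.
# All names here are hypothetical placeholders, not from any of the codebases.

def train_one_stage(model, features, labels, train_idx, epochs=100, lr=0.01):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        logits = model(features)
        loss = F.cross_entropy(logits[train_idx], labels[train_idx])
        loss.backward()
        optimizer.step()
    return model

def multi_stage_training(model, features, labels, train_idx, unlabeled_idx,
                         num_stages=3, threshold=0.9):
    labels = labels.clone()        # pseudo labels are written into a copy
    train_idx = train_idx.clone()
    for stage in range(num_stages):
        model = train_one_stage(model, features, labels, train_idx)
        model.eval()
        with torch.no_grad():
            probs = F.softmax(model(features), dim=1)
        conf, pseudo = probs[unlabeled_idx].max(dim=1)
        mask = conf > threshold
        confident = unlabeled_idx[mask]
        # enlarge the training set with confident pseudo-labeled nodes
        labels[confident] = pseudo[mask]
        train_idx = torch.cat([train_idx, confident]).unique()
        unlabeled_idx = unlabeled_idx[~mask]
    return model
```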

skepsun commented 2 years ago

Thanks for your kind response. If you carefully read the two papers and the associated code, you will find that even some of the formulas and code (including the variable naming) are identical. They even couple GAMLP with label usage to 'outperform' pure SAGN without label usage, which can easily mislead readers.

And I'm sorry for being a bit emotional when first reading your paper. I had contacted the authors of GAMLP to ask them to clarify that the label-usage and multi-stage-training paradigm in GAMLP is nearly an exact copy of SAGN+SLE. They promised not to publish the paper, but refused to add any explanation. (Their excuse is that using labels and multi-stage training are existing techniques, so they did not 'copy' our paper. However, which was the first paper to incorporate these tricks into simplified/decoupled GNNs with important modifications? SAGN, not GAMLP.)

To be honest, I don't really understand how authors from Peking University and Tencent could do something that comes so close to plagiarism.

I forgot to point out that RLU is actually just a slightly modified copy of SLE from the SAGN paper. The only differences between RLU and SLE are two additional hyperparameters and the use of soft labels.

Most of the technical points in the GAMLP paper are lifted directly from SAGN. The only part that can be called their own contribution is the three attention mechanisms, which build on their previous work and are where GAMLP genuinely "collides" with SAGN rather than copies it. The funny thing is that they designed no ablation study for these mechanisms (nor would they dare to add one). Judging from the results, most of their model's performance comes from the label model proposed in the SAGN paper and from SLE, which includes the label model. Their trick is: copy, pretend you did not, and still cite the original (a bare citation, with no explanation at all). What is even more absurd is that, to compare SAGN and SIGN fairly, I deliberately counted the label model as part of SLE, while they counted the label model as "part of" GAMLP. Done that way, how could GAMLP (with a strong label model) not score higher than SAGN (whose label model is attributed to SLE)? Honestly, if these Peking University authors copied, so be it, but when I emailed them the reply was extremely arrogant, and my email to their dean went unanswered. Defending one's academic work is really hard. Since they promised not to publish the paper publicly, I could only put up with it for the time being. Then I saw your paper and realized that people really are being misled by them. Sigh.

skepsun commented 2 years ago

GAMLP has been accepted by KDD 2022

Wow, maybe I should do something more.