HKUDS / GraphGPT

[SIGIR'2024] "GraphGPT: Graph Instruction Tuning for Large Language Models"
https://arxiv.org/abs/2310.13023
Apache License 2.0

Why are the baseline results so low? #2

Closed ShuaiXuan closed 10 months ago

ShuaiXuan commented 10 months ago

Hey! Thank you very much for your work. I was greatly inspired after reading it, but I have a small question about the baseline results. I noticed you cited this article: https://arxiv.org/pdf/2305.19523.pdf, yet its Arxiv results differ from yours. You seem to follow the same public split, but GCN reaches 0.7182 and SAGE reaches 0.7171 in that article, while in your paper they are only 0.5267 and 0.5480 respectively. Why is that?

tjb-tech commented 10 months ago

Thank you very much for your attention to our GraphGPT! In our experiments, we don't use the same graph data as the paper https://arxiv.org/pdf/2305.19523.pdf. As we stated in Sec 4.1.2, we encoded the raw text information using a pre-trained BERT model:

[screenshot of Sec 4.1.2 from the paper]

The node mapping strategy, however, follows the above paper, and we re-ran all the baselines ourselves. In particular, you can refer to https://huggingface.co/datasets/Jiabin99/All_pyg_graph_data to find the graph data we used. Hope my answer is helpful, and thank you again for the issue!
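For anyone trying to reproduce this setup, here is a minimal sketch of encoding raw node text with a pre-trained BERT model to obtain node features for a PyG graph. This is only an illustration of the general idea, not the authors' exact pipeline: the checkpoint name (`bert-base-uncased`), the [CLS] pooling choice, and the `node_texts` / `edge_index` variables are assumptions for the example.

```python
# Illustrative sketch (assumed details, not the authors' exact code):
# encode raw node text with a pre-trained BERT model into node features.
import torch
from transformers import AutoTokenizer, AutoModel
from torch_geometric.data import Data

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def encode_texts(texts, batch_size=32):
    """Encode a list of raw node texts into fixed-size embeddings ([CLS] token)."""
    feats = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], padding=True,
                          truncation=True, max_length=512, return_tensors="pt")
        out = bert(**batch).last_hidden_state[:, 0]  # [CLS] embedding per text
        feats.append(out)
    return torch.cat(feats, dim=0)

# Hypothetical usage: `node_texts` holds the raw title/abstract per node and
# `edge_index` the graph connectivity; both depend on the dataset files you load.
# data = Data(x=encode_texts(node_texts), edge_index=edge_index)
```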

Tebmer commented 10 months ago

Hi, your work is really interesting. I am new to this dataset.

I am a little confused: did you (or is it necessary to) re-run the baselines on your own data when you changed the graph data in your paper? (It looks like a lot of work.)