TianxiangZhao / GraphSmote

PyTorch implementation of the paper 'GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural Networks', to appear at WSDM 2021

The GraphSMOTE(T) results were not satisfactory #6

Open cheng-zhimin opened 1 year ago

cheng-zhimin commented 1 year ago

Hi, I'd like to reproduce the results of the GraphSMOTE(T) experiment. I ran this command:

python main.py --imbalance --no-cuda --dataset=cora --setting='recon'

The results are as follows:

ACC = 0.271, AUC-ROC = 0.5000, F Score = 0.1843

I have tried many times but couldn't get the GraphSMOTE(T) results reported in the paper. Is there something wrong with my command, or with something else I am doing? Thank you.

TianxiangZhao commented 1 year ago

Hi, as far as I remember, 'recon' is only used to pretrain the edge predictor and does not optimize the classifier. You can save the pretrained model and then run with a setting like 'newG_cls', following the Readme. To test the result without pretraining, you can run directly with a setting like 'recon_newG' without loading any checkpoint.

cheng-zhimin commented 1 year ago

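Hello, I have received your reply. Thank you for taking time out of your busy schedule to answer. For ease of communication, please allow me to write in Chinese. According to your paper, GraphSMOTE(T) is trained using only the loss from the edge prediction task, while GraphSMOTE(O) is trained with both the edge prediction loss and the node classification loss. Based on my reading of the Readme, settings = 'recon_newG' corresponds to GraphSMOTE(O), not GraphSMOTE(T). Have I misunderstood the Readme? I look forward to your reply, and thanks again!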


TianxiangZhao commented 1 year ago

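Hi, sorry for the very late reply. Without loading a checkpoint, recon_newG corresponds to GraphSMOTE(O) and newG_cls corresponds to GraphSMOTE(T). If you pretrain with recon and then load the checkpoint, they correspond to (O_pre) and (T_pre) respectively.

Hope this helps.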
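For reference, the correspondence described in this thread can be written down as a small lookup. This is only a sketch of the mapping as stated in the replies above, not code from the repository: the function name `variant_for` and the `checkpoint_loaded` flag are hypothetical, and only the setting strings and variant names come from the thread.

```python
# Mapping as stated in the maintainer's replies in this thread.
# Keys are (setting, checkpoint_loaded); the flag name is an assumption
# made for illustration and is not an option of main.py.
VARIANTS = {
    ("recon_newG", False): "GraphSMOTE(O)",
    ("newG_cls", False): "GraphSMOTE(T)",
    ("recon_newG", True): "GraphSMOTE(O_pre)",
    ("newG_cls", True): "GraphSMOTE(T_pre)",
}


def variant_for(setting, checkpoint_loaded):
    """Return the paper variant that a (setting, checkpoint) pair corresponds to."""
    if setting == "recon":
        # Per the thread: 'recon' only pretrains the edge predictor and
        # does not optimize the classifier, so it matches no reported variant.
        return "edge-predictor pretraining only"
    try:
        return VARIANTS[(setting, checkpoint_loaded)]
    except KeyError:
        raise ValueError(f"setting not covered in this thread: {setting!r}")


if __name__ == "__main__":
    for key in sorted(VARIANTS):
        print(key, "->", variant_for(*key))
```

Read this way, the original command with --setting='recon' trains no classifier, which is consistent with the near-chance metrics reported at the top of the thread.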

cheng-zhimin commented 1 year ago

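I was thrilled to receive your reply. Thank you for taking my question seriously and for your help. I have recently made some small progress in my research on imbalanced graphs. For my experiments, I obtained the Cora and BlogCatalog datasets from the links you provided on GitHub, but the link to the Twitter dataset no longer works. I have also tried processing datasets such as citeseer and pubmed myself. Could you share the Twitter dataset you used? Thank you again for your generous guidance!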


TianxiangZhao commented 1 year ago

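Sorry, I only have the raw Twitter dataset now. The node embedding file, 'twitter.embeddings_64', was generated with the DeepWalk package. You can refer to the function 'load_data_twitter()' in data_load.py. If you want to extract a subgraph from the Twitter dataset, you can use the provided 'Extract_graph()' function. Sorry for the inconvenience!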
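As a rough illustration of what a regenerated embedding file looks like: DeepWalk writes its output in the word2vec text format, with a header line giving the node count and dimension, followed by one line per node. The sketch below parses that format with the standard library only. It is not the repository's 'load_data_twitter()'; the name `read_embeddings` is hypothetical, and it assumes node ids are integers.

```python
import io


def read_embeddings(handle):
    """Parse word2vec text format: 'count dim' header, then 'node_id v1 ... v_dim'."""
    header = handle.readline().split()
    count, dim = int(header[0]), int(header[1])
    embeddings = {}
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        node_id = int(parts[0])  # assumption: node ids are integers
        vector = [float(x) for x in parts[1:]]
        if len(vector) != dim:
            raise ValueError(f"node {node_id}: expected {dim} values, got {len(vector)}")
        embeddings[node_id] = vector
    if len(embeddings) != count:
        raise ValueError(f"header declares {count} nodes, file has {len(embeddings)}")
    return embeddings


# Tiny in-memory sample standing in for a real embeddings file.
sample = io.StringIO("2 3\n0 0.1 0.2 0.3\n1 0.4 0.5 0.6\n")
emb = read_embeddings(sample)
print(len(emb), "nodes loaded")
```

For a real file, the same function can be called on an open file object, for example `with open(path) as f: emb = read_embeddings(f)`, where `path` points at the embeddings file generated locally.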

TianxiangZhao commented 1 year ago

I found that the link to twitter_fake_ids.cvs had expired, so I have just added the file to the data folder.

cheng-zhimin commented 1 year ago

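Thank you for your guidance. While processing the Twitter dataset following your method, I found that the twitter.csv file is missing. I have tried many approaches but cannot find a twitter.csv file matching the one used in the paper. Could you provide it? Thank you.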


cheng-zhimin commented 1 year ago

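Hello, I have now downloaded the twitter.csv file through the link you provided on GitHub. Earlier the page would not refresh because of a network problem on my side. Thank you again for your help and guidance!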
