ChandlerBang / Pro-GNN

Implementation of the KDD 2020 paper "Graph Structure Learning for Robust Graph Neural Networks"
https://arxiv.org/abs/2005.10203

cora dataset node classification performance under mettack #5

Closed: cxw-droid closed this issue 3 years ago

cxw-droid commented 3 years ago

Hi,

Thanks for sharing your paper's code.

  1. I tried to reproduce the cora test results for classification accuracy under mettack shown in Table 2 of your paper, but the performance I get is much lower (e.g., accuracy 0.8033 for ptb_rate 0.05, 0.7425 for 0.1, and 0.6831 for 0.15). I used the default settings and ran python train.py --dataset cora --attack meta --ptb_rate 0.1 --epoch 1000, changing ptb_rate as needed. Could you tell me which settings you used, or how I can reproduce similar results?

  2. The "Run the code" command in your README is broken: python train.py --dataset polblogs --attack meta --ptb_rate 0.15 --epoch 1000 fails with AssertionError: ProGNN splits only cora, citeseer, pubmed, cora_ml.

ChandlerBang commented 3 years ago

Hi, thanks for your interest!

  1. To reproduce the performance, please run the scripts in the scripts folder, as mentioned in the README. For example,

    sh scripts/meta/cora_meta.sh

    To test performance under different attack severities, change the ptb_rate in those bash files; see the sketch after this list.

  2. Thanks for pointing it out. I have just fixed the bug. You can now re-clone DeepRobust and reinstall it:

    git clone https://github.com/DSE-MSU/DeepRobust.git
    cd DeepRobust
    python setup_empty.py install
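
For reference, each of those scripts is essentially a single train.py call. A minimal sketch of scripts/meta/cora_meta.sh (only flags that appear in this thread are shown; the real script also sets Pro-GNN's hyper-parameters):

    # Sketch of scripts/meta/cora_meta.sh.
    # Change --ptb_rate to test other attack severities.
    python train.py \
        --dataset cora \
        --attack meta \
        --ptb_rate 0.05 \
        --seed 10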

Let me know if you have more questions.

cxw-droid commented 3 years ago

Thanks for your quick response.

I tested the code with sh scripts/meta/cora_meta.sh on the cora dataset at different ptb_rate values. The results are as follows:

    ptb_rate  accuracy
    0.05      0.8295
    0.10      0.7822
    0.15      0.7631
    0.20      0.5739
    0.25      0.5282

The accuracy at ptb_rate 0.20 and 0.25 drops sharply and is much lower than expected. I changed only the ptb_rate in the shell script between runs.

ChandlerBang commented 3 years ago

Hi, thanks for the feedback!

I just found that the hyper-parameters provided in the cora-meta scripts are not exactly the same as those used in the experiments. Please change lr=1e-3 to lr=5e-4 and epoch=400 to epoch=1000. I have also updated them in the script.
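
In diff form, the fix to scripts/meta/cora_meta.sh looks like this (the surrounding lines of the script are omitted):

    - lr=1e-3
    - epoch=400
    + lr=5e-4
    + epoch=1000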

ChandlerBang commented 3 years ago

Btw, below are the results I got for cora-meta (for one seed):

    ptb_rate  accuracy
    0.05      0.8300
    0.10      0.7943
    0.15      0.7606
    0.20      0.7369
    0.25      0.6942

cxw-droid commented 3 years ago

Thanks, I got a similar result.

But when I tried with a couple of different seeds, the results were 0.02~0.03 lower for ptb_rate 0.2 and 0.25. Do you have the code/script to run multiple rounds of tests to get the mean and std of the accuracy?

ChandlerBang commented 3 years ago

Hi,

1) It can happen that some results come out lower: the variance at those perturbation rates is relatively high. I would suggest running more seeds.

2) I cannot find the script now, but it basically runs the given command 10 times (note that all the experiments are evaluated under seeds 10 to 19). The Python script does something like this:

# filename: run.py
# Run train.py once for each evaluation seed (10 to 19).
import os

seeds = list(range(10, 20))
for seed in seeds:
    command = "python train.py --dataset cora --seed %s" % seed
    os.system(command)

Then you run

python run.py >> cora.out

The cora.out file stores all the output from the runs; we can then write a simple script that extracts the lines containing the string "Test set results:" and computes the mean/std of the accuracies.
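
A minimal sketch of such a script (it assumes the accuracy is the last number printed on each "Test set results:" line; the exact output format may differ):

# filename: parse.py
# Collect test accuracies from cora.out and report their mean/std.
import re
import statistics

accs = []
with open("cora.out") as f:
    for line in f:
        if "Test set results:" in line:
            # Assumption: the accuracy is the last number on the line.
            numbers = re.findall(r"\d+\.\d+", line)
            if numbers:
                accs.append(float(numbers[-1]))

print("runs: %d" % len(accs))
print("mean: %.4f" % statistics.mean(accs))
print("std:  %.4f" % statistics.stdev(accs))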