DSE-MSU / DeepRobust

A pytorch adversarial library for attack and defense methods on images and graphs
MIT License

Some questions about pgd_test #95

Open likuanppd opened 2 years ago

likuanppd commented 2 years ago

Hi Jin

I ran into some problems when running pgd_test from the graph global-attack examples.

  1. In line 37, why normalize the feature matrix? This hurts the performance of GCN on clean graphs.

  2. There might be a bug in line 43: AttributeError: 'numpy.ndarray' object has no attribute 'todense'. Features loaded via torch_geometric are np.ndarray, not csr_matrix. I replaced the data loader with the one most commonly used in your other examples: data = Dataset(root='/tmp/', name=args.dataset, setting='gcn'). (A minimal guard for this error is sketched after this list.)

  3. Another problem is that neither the poisoning attack nor the evasion attack appears to work. Am I missing some details?

(base) D:\Nut\Experiments\Attack graphs>python pgd.py --dataset=cora --ptb_rate=0.05
Loading cora dataset...
=== testing GCN on clean graph ===
Test set results: loss= 0.6284 accuracy= 0.8170
=== setup attack model ===
100%|██████████████████████████████████████| 100/100 [00:06<00:00, 14.52it/s]
=== testing GCN on Evasion attack ===
Test set results: loss= 0.7325 accuracy= 0.8140
=== testing GCN on Poisoning attack ===
Test set results: loss= 0.6890 accuracy= 0.8200
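For reference, a minimal guard for the todense error would look roughly like this (just a sketch, assuming the features come back from the Pyg loader as a dense np.ndarray):

import numpy as np
import scipy.sparse as sp

# If the loader returns a dense ndarray, wrap it so that the csr_matrix
# methods the example expects (e.g. todense) are available again.
if isinstance(features, np.ndarray):
    features = sp.csr_matrix(features)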

ChandlerBang commented 2 years ago

Hi, thank you for your feedback.

  1. We follow the author's implementation and normalize the feature matrix (see https://github.com/KaidiXu/GCN_ADV_Train/blob/master/train.py). A short sketch of that normalization is included at the end of this comment.

2/3. Are you using the latest code? The latest code should not have the "todense()" issue and I can get a reasonable attack performance.

$ python test_pgd.py  --dataset cora
=== testing GCN on clean graph ===
Test set results: loss= 0.7677 accuracy= 0.8190
=== setup attack model ===
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:06<00:00, 14.29it/s]
=== testing GCN on Evasion attack ===
Test set results: loss= 0.9961 accuracy= 0.7310
=== testing GCN on Poisoning attack ===
Test set results: loss= 1.0123 accuracy= 0.7340
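
On point 1, a minimal sketch of the row normalization that GCN-style preprocessing typically applies to the feature matrix (not necessarily the exact code in deeprobust.graph.utils):

import numpy as np
import scipy.sparse as sp

def row_normalize(features):
    # Scale each node's feature row to sum to 1; all-zero rows are left unchanged.
    rowsum = np.array(features.sum(1)).flatten()
    r_inv = np.zeros_like(rowsum, dtype=float)
    r_inv[rowsum != 0] = 1.0 / rowsum[rowsum != 0]
    return sp.diags(r_inv).dot(features)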
likuanppd commented 2 years ago

Thanks for your reply. I found the reason. The preprocess function in DeepRobust/utils.py on GitHub is not the same as the one in the installed library. I reinstalled deeprobust-0.2.4, but it still reported the "todense()" issue. I finally solved the problem by copying this function directly, and the results became the same as yours.

But I'm still confused about why running the code the way I described makes the attack not work at all. The only difference seems to be the preprocess function.

likuanppd commented 2 years ago

I tried to load the dataset with data = Dataset(root='/tmp/', name=args.dataset, setting='gcn'). It destroys the performance of GCN on the clean graph and makes the attack fail, so the failure of the attack is caused by the data loading.

data = Dataset(root='c:/tmp/', name=args.dataset, setting='gcn')

from torch_geometric.datasets import Planetoid
from deeprobust.graph.data import Pyg2Dpr
dataset = Planetoid('./', name=args.dataset)
data = Pyg2Dpr(dataset)

Is there any difference between these two dataloaders? I also tried data = Dataset(root='c:/tmp/', name=args.dataset, setting='nettack'); PGD does not appear to work with this data split either.

(base) D:\Nut\Experiments\Attack graphs>python pgd.py --dataset=cora
Loading cora dataset...
Selecting 1 largest connected components
=== testing GCN on clean graph ===
Test set results: loss= 0.6457 accuracy= 0.8441
=== setup attack model ===
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:05<00:00, 16.94it/s]
=== testing GCN on Evasion attack ===
Test set results: loss= 0.6401 accuracy= 0.8426
=== testing GCN on Poisoning attack ===
Test set results: loss= 0.6511 accuracy= 0.8365
ChandlerBang commented 2 years ago

Thanks for your reply. I found the reason. The preprocess function in DeepRobust/utils.py on GitHub is not the same as the one in the installed library. I reinstalled deeprobust-0.2.4, but it still reported the "todense()" issue. I finally solved the problem by copying this function directly, and the results became the same as yours.

The issue was not addressed in deeprobust-0.2.4, so to get the latest deeprobust we recommend installing from source:

git clone https://github.com/DSE-MSU/DeepRobust.git
cd DeepRobust
python setup.py install

But I'm still confused about why running the code the way I described makes the attack not work at all. The only difference seems to be the preprocess function.

I am not sure about this. It could depend on how you addressed the "todense" issue in the code.

ChandlerBang commented 2 years ago

Is there any difference between these two dataloaders?

The Pyg dataloader provides the fixed data split and the full graph, as in the original GCN paper; the gcn setting provides randomly selected splits with the same ratio as the GCN paper; the nettack setting follows the Nettack paper, uses a different split ratio, and keeps only the largest connected component of the graph.
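
For concreteness, the three loading options discussed in this thread look roughly like this (a sketch based on the snippets above):

from torch_geometric.datasets import Planetoid
from deeprobust.graph.data import Dataset, Pyg2Dpr

# (a) Pyg loader: fixed public split, full graph (as in the original GCN paper)
data_pyg = Pyg2Dpr(Planetoid('./', name='cora'))

# (b) 'gcn' setting: randomly selected split with the same ratio as the GCN paper
data_gcn = Dataset(root='/tmp/', name='cora', setting='gcn')

# (c) 'nettack' setting: different split ratio, largest connected component only
data_nettack = Dataset(root='/tmp/', name='cora', setting='nettack')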

I also tried data = Dataset(root='c:/tmp/', name=args.dataset, setting='nettack'); PGD does not appear to work with this data split either.

The data provided by the Pyg dataloader is the same as the data used in the original PGD implementation. We followed their implementation and are able to reproduce the attack performance on this data split.

Well, this is interesting. I got the same result as yours, and I am not very sure why PGD fails under the setting data = Dataset(root='c:/tmp/', name=args.dataset, setting='nettack'). But when I commented out https://github.com/DSE-MSU/DeepRobust/blob/756453e894df2acd154c0016b9e25836c8960b27/examples/graph/test_pgd.py#L91-L96 and uncommented https://github.com/DSE-MSU/DeepRobust/blob/756453e894df2acd154c0016b9e25836c8960b27/examples/graph/test_pgd.py#L90, I was able to get the following performance:

$python test_pgd.py --dataset cora
Loading cora dataset...
Selecting 1 largest connected components
=== testing GCN on clean graph ===
Test set results: loss= 0.7055 accuracy= 0.8295
=== setup attack model ===
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:08<00:00, 12.39it/s]
=== testing GCN on Evasion attack ===
Test set results: loss= 0.7221 accuracy= 0.8260
=== testing GCN on Poisoning attack ===
Test set results: loss= 0.9361 accuracy= 0.7651

What I want to note here is that the very early test_pgd.py used Line 90 (the setting in Metattack) instead of Lines 91-96 (the setting in the PGD attack paper). Line 90 consistently yields reasonable poisoning performance while failing on the evasion attack; Lines 91-96 usually produce better evasion performance, but it seems (as you pointed out) they can sometimes fail to attack some data splits. Maybe Lines 91-96 do not work for the largest-connected-component setting.
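
To make the two variants concrete, they roughly look like this (a paraphrase of the example script; the Line 90 call is reconstructed from the Metattack-style setting, so treat its exact form as an assumption):

# Variant A -- Line 90 (the Metattack setting): attack using only the
# ground-truth labels of the training nodes
model.attack(features, adj, labels, idx_train, perturbations, epochs=args.epochs)

# Variant B -- Lines 91-96 (the PGD-paper setting): replace the labels with the
# victim model's predictions so that test nodes also contribute to the attack loss
fake_labels = target_gcn.predict(features.to(device), adj.to(device))
fake_labels = torch.argmax(fake_labels, 1).cpu()
idx_fake = np.concatenate([idx_train, idx_test])
model.attack(features, adj, fake_labels, idx_fake, perturbations, epochs=args.epochs)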

likuanppd commented 2 years ago

Line 90 seems equivalent to Meta-Train, and Lines 91-96 are similar to Meta-Self. I conducted more experiments, and the results show that PGD-evasion only works on the standard split (the torch_geometric setting), regardless of whether Line 90 or Lines 91-96 is used.

Thanks for your helpful replies.

likuanppd commented 1 year ago

Hi Jin, I have published a paper at ICLR on this problem, which can be found here: https://openreview.net/forum?id=dSYoPjM5J_W

ChandlerBang commented 1 year ago

Hi @likuanppd, thanks for the insightful work! I really like the observation about the connection between robustness and distribution shift. It is interesting to see how it explains some phenomena in graph adversarial attacks.

Btw, (1) if you would like to contribute to DeepRobust, feel free to open a pull request. (2) I would also like to mention our (slightly related) ICLR'23 work that you might be interested in.

yuChen-XD commented 1 year ago

Hi @ChandlerBang, I find that the test_pgd.py currently in DeepRobust is different from the code for lines 91 to 96 that you showed above. The code you showed is

# Here for the labels we need to replace it with predicted ones
fake_labels = target_gcn.predict(features.to(device), adj.to(device))
fake_labels = torch.argmax(fake_labels, 1).cpu()
# Besides, we need to add the idx into the whole process
idx_fake = np.concatenate([idx_train,idx_test])
model.attack(features, adj, fake_labels, idx_fake, perturbations, epochs=args.epochs)

The code in the current version is

    # Here for the labels we need to replace it with predicted ones
    fake_labels = target_gcn.predict(features.to(device), adj.to(device))
    fake_labels = torch.argmax(fake_labels, 1).cpu()
    # Besides, we need to add the idx into the whole process
    idx_fake = np.concatenate([idx_train,idx_test])

    idx_others = list(set(np.arange(len(labels))) - set(idx_train))
    fake_labels = torch.cat([labels[idx_train], fake_labels[idx_others]])
    model.attack(features, adj, fake_labels, idx_fake, perturbations, epochs=args.epochs)

And I ran the code you showed with data = Dataset(root='./tmp/', name='cora', setting='nettack'), and the result is

Loading cora dataset...
Selecting 1 largest connected components
=== testing GCN on clean graph ===
Test set results: loss= 0.6980 accuracy= 0.8320
=== setup attack model ===
=== testing GCN on Evasion attack ===
Test set results: loss= 0.8186 accuracy= 0.7842
=== testing GCN on Poisoning attack ===
Test set results: loss= 0.8061 accuracy= 0.7902

This result is different from likuanppd's. But when I run the current version of test_pgd.py, the result shown below is similar to the result likuanppd showed.

Loading cora dataset...
Selecting 1 largest connected components
=== testing GCN on clean graph ===
Test set results: loss= 0.6980 accuracy= 0.8320
=== setup attack model ===
=== testing GCN on Evasion attack ===
Test set results: loss= 0.7109 accuracy= 0.8239
=== testing GCN on Poisoning attack ===
Test set results: loss= 0.6950 accuracy= 0.8255

However, the current version of test_pgd.py seems to have a bug: the node indices of 'fake_labels' are inconsistent with the 'idx_fake' passed to model.attack(...). I would like to confirm whether my observation is correct. Thank you in advance for your help.
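
For what it's worth, one way the indexing could be kept consistent would be to keep fake_labels as a full-length vector indexed by node id and only overwrite the training nodes with their ground-truth labels (just a sketch of a possible fix, assuming labels is a CPU torch tensor; not the maintainers' official patch):

# Predicted label for every node, stored at the node's own index
fake_labels = target_gcn.predict(features.to(device), adj.to(device))
fake_labels = torch.argmax(fake_labels, 1).cpu()
# Overwrite training nodes with their true labels; positions stay aligned with node ids
fake_labels[idx_train] = labels[idx_train]
idx_fake = np.concatenate([idx_train, idx_test])
model.attack(features, adj, fake_labels, idx_fake, perturbations, epochs=args.epochs)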