danielzuegner / gnn-meta-attack

Implementation of the paper "Adversarial Attacks on Graph Neural Networks via Meta Learning".
https://www.kdd.in.tum.de/gnn-meta-attack
MIT License

Cannot reproduce the performance on citeseer and polblogs #3

HappierTreeFriend closed this issue 5 years ago

HappierTreeFriend commented 5 years ago

Hi Daniel,

I ran your code on different datasets, but I just cannot reproduce the results from the paper.

The GCN misclassification rate reported in your paper on the clean citeseer dataset is 28.5 ± 0.9; however, in my implementation (I used pygcn), the misclassification rate is about 25%.

I ran your code to generate a 5% perturbed graph and used it as input, which gives 73.7% classification accuracy. So on the citeseer dataset with 5% perturbations, the performance of the GCN model only drops by 1-2%, not by ~6% as in the paper.

I don't know what is wrong, and I am wondering whether there are additional attacker parameters I need to tune.

I would really appreciate it if you could help me with this.

danielzuegner commented 5 years ago

Hi,

thanks for creating this issue. I've investigated what might be going on. I think most of it boils down to the fact that pygcn does a few things slightly differently from what we (and, in one case, the original GCN paper; see my last point) do.

The difference in the 'clean' accuracy seems to be due to the fact that we do not perform feature normalization. When I turn off feature normalization, the performance of pygcn is pretty close to what we get.
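For reference, pygcn row-normalizes the feature matrix before training; here is a minimal sketch of that step (illustrative only, not a verbatim copy of pygcn's utils.py):

import numpy as np
import scipy.sparse as sp

def row_normalize(features):
    # Scale each row of the (sparse) feature matrix to sum to 1.
    rowsum = np.asarray(features.sum(axis=1)).flatten()
    inv = np.zeros_like(rowsum, dtype=float)
    inv[rowsum != 0] = 1.0 / rowsum[rowsum != 0]
    return sp.diags(inv) @ features

Skipping this step, i.e. feeding the raw binary features, should bring the 'clean' numbers in line with ours.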

I believe the difference in the performance drop caused by the attack is due to pygcn using L2 regularization (weight decay), which we do not use during our attack. After quickly hacking in a change to line 529 of meta_gradient_attack.py:

# Add L2 regularization (weight decay 5e-4) on the surrogate's weights and
# biases during the attack, matching pygcn's training objective.
loss = tf.reduce_mean(loss_per_node) + 5e-4 * (
    tf.nn.l2_loss(current_weights[0]) + tf.nn.l2_loss(current_weights[1]) +
    tf.nn.l2_loss(current_biases[0]) + tf.nn.l2_loss(current_biases[1]))

the performance of pygcn dropped by roughly 5 points, similar to our paper. Of course, we have to make sure to use the same train/validation/test split that was used during the attack.
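One simple way to keep the split consistent between the attack and the evaluation is to derive it from a fixed seed. A sketch (the function name is mine; any deterministic split utility works just as well):

import numpy as np

def fixed_split(num_nodes, train_frac=0.1, val_frac=0.1, seed=42):
    # Deterministic train/val/test split so the attack and the
    # evaluation see exactly the same node sets.
    rng = np.random.RandomState(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]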

Additionally, pygcn appears to normalize the adjacency matrix differently from the original GCN paper. That is, instead of the symmetric normalization D_tilde^(-1/2) A_tilde D_tilde^(-1/2) (Eq. (2) in that paper), pygcn uses the row normalization D_tilde^(-1) A_tilde.
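To make the difference concrete, here is a small sketch of the two variants (the function is mine, not from either codebase):

import numpy as np
import scipy.sparse as sp

def normalize_adjacency(adj, symmetric=True):
    # A_tilde = A + I (add self-loops); D_tilde = degree matrix of A_tilde.
    adj_tilde = adj + sp.eye(adj.shape[0])
    deg = np.asarray(adj_tilde.sum(axis=1)).flatten()
    if symmetric:
        # GCN paper, Eq. (2): D_tilde^(-1/2) A_tilde D_tilde^(-1/2)
        d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
        return d_inv_sqrt @ adj_tilde @ d_inv_sqrt
    # What pygcn computes: D_tilde^(-1) A_tilde
    return sp.diags(1.0 / deg) @ adj_tilde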

I hope this clarification helps -- please let me know if you have any further questions.

HappierTreeFriend commented 5 years ago

Thanks so much for your instant reply!!!

Following your advice, I now get similar results on the citeseer and cora_ml datasets!

But I still cannot reproduce the results on PolBlogs. Since it does not have node features, I treat every node's features as a one-hot vector (i.e., the feature matrix is the identity), as you suggested in another issue. The 'clean' accuracy is then about 96%, while on the 'meta-self' perturbed graph the model gets an accuracy of 87%, not roughly 78% as reported in the paper.
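In case it matters, this is how I construct the features (a sketch; adj is the loaded PolBlogs adjacency matrix):

import scipy.sparse as sp

# One-hot feature vector per node: the N x N identity matrix.
features = sp.eye(adj.shape[0], format="csr")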

Did you use any other tricks to deal with the PolBlogs dataset?

danielzuegner commented 5 years ago

Hi,

I did a deep dive into the original implementation and the results we used for the paper. It seems I missed a few details when re-implementing everything from scratch for publication.

After fixing those, I re-attacked both PolBlogs and Citeseer and got results very similar to what we report in the paper. Thank you very much for pointing this out; I hope you can get the desired results now!

HappierTreeFriend commented 5 years ago

Cool!!! You've been very helpful, and I really admire you for your earnest attitude!