Questions Regarding DPBGA's Performance with Different Target Classes and Defenses

Joney-Yf commented 4 days ago

I've been working with DPBGA and have encountered some issues that I'd like to clarify:

ASR Drops to Zero with Different Target Class:

When I change the target class (e.g., to Flickr), the Attack Success Rate (ASR) drops to 0%. Is this expected behavior?

Effectiveness Against Prune Defense:

After switching the defense strategy to pruning, I noticed the ASR decreases significantly—from over 90% down to around 10%. Does this imply that DPBGA is ineffective against pruning defenses?

Trigger Effectiveness Without Dataset Poisoning:

Even without poisoning the dataset, the trigger remains effective after training. Does this suggest that DPBGA functions more as an adversarial attack rather than a backdoor attack?

I would greatly appreciate any insights or explanations regarding these observations.

Thank you!

zzwjames commented 17 hours ago

Hi,

Thanks for your questions.

Attack performance for different target classes I believe nodes from certain classes are inherently harder to attack. I believe achieving high attack performance across different classes is a topic worth studying.

Effectiveness against pruning defense Our trigger generator is flexible to incorporate homophily loss, as proposed in UGBA, enabling the triggers to remain in-distribution while also bypassing the pruning defense.

Trigger effectiveness without dataset poisoning Since GNNs utilize neighbor information, when DPGBA generates in-distribution triggers similar to the original neighbors of the target nodes, it can still result in a successful attack, even if the GNN is trained on a clean graph.

Joney-Yf commented 12 hours ago

Hi,

Thanks for your questions.

Attack performance for different target classes I believe nodes from certain classes are inherently harder to attack. I believe achieving high attack performance across different classes is a topic worth studying.

Effectiveness against pruning defense Our trigger generator is flexible to incorporate homophily loss, as proposed in UGBA, enabling the triggers to remain in-distribution while also bypassing the pruning defense.

Trigger effectiveness without dataset poisoning Since GNNs utilize neighbor information, when DPGBA generates in-distribution triggers similar to the original neighbors of the target nodes, it can still result in a successful attack, even if the GNN is trained on a clean graph.

Hello, thanks for your reply.

I observed that by adding homo_loss, the attack is able to bypass Prune defenses effectively, similar to the approach described in the UGBA paper.

However, during testing, instead of pruning before training, I applied an Out-Of-Distribution (OOD) defense using the reconstruct_prune function and removed the top 15% of anomalous edges from the generated poison samples. The results were as follows:

Total Overall ASR: 0.0174 Total Clean Accuracy: 0.8370 Does this indicate that DPGBA becomes ineffective when the OOD defense strength is increased?

Additionally, I noticed that the generated triggers carry target class information, which influences feature aggregation. This seems to blur the distinction between backdoor and adversarial attacks, as backdoors typically require injecting specific patterns through poisoning. However, in practice, attacks can succeed without injecting such patterns during training, which appears to contradict the fundamental premise of backdoor attacks.

Could you provide any insights on whether the reduced ASR under stronger OOD defenses suggests a limitation of DPGBA, and how the characteristics of trigger generation and feature aggregation in GNNs influence the differentiation between backdoor and adversarial attacks?

Thanks again

zzwjames / DPGBA

Questions Regarding DPBGA's Performance with Different Target Classes and Defenses #2