jiachens / ModelNet40-C

Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
https://sites.google.com/umich.edu/modelnet40c
BSD 3-Clause "New" or "Revised" License

PGD-based adversarial training for Pointnet++ #12

Closed · xingbw closed this 2 years ago

xingbw commented 2 years ago

Hi, jiachen. In the paper, it says "PointNet++ and RSCNN leverage ball queries to find neighboring points, which will hinder the gradients from backward propagating to the original point cloud, making adversarial training inapplicable." I am wondering why the ball query operation hinders gradients from backpropagating to the input. If so, is the optimization of PointNet++ in standard training also affected? This confuses me; I hope to receive your reply!

jiachens commented 2 years ago

Hi xingbw,

Thanks for your question! Ball queries use randomly selected points or farthest point sampling to choose anchor points for clustering, and this selection process is not differentiable w.r.t. the input. Note that this does not affect standard training, since standard training only requires gradient flow to the model parameters, and ball queries do not involve any trainable parameters. Adversarial training, however, requires gradient flow to the input. That is why AT does not work well with these random clustering operations.
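To make this concrete, here is a minimal PyTorch sketch (my own toy illustration, not the repo's code) of why the neighbor selection blocks gradients to the input even though the subsequent gather is differentiable:

```python
import torch

# Toy cloud: batch 1, 64 points, xyz; we want gradients w.r.t. these coords.
pts = torch.randn(1, 64, 3, requires_grad=True)

# Ball-query-like selection: take the k nearest points to an anchor.
# Distances are differentiable, but topk returns *integer* indices,
# so autograd records no path from `pts` through the selection itself.
anchor = pts[:, :1, :]                                # (1, 1, 3)
d2 = ((pts - anchor) ** 2).sum(dim=-1)                # (1, 64), differentiable
idx = torch.topk(d2, k=8, largest=False).indices      # (1, 8), no grad_fn

# Gathering with fixed indices IS differentiable w.r.t. the gathered values.
grouped = torch.gather(pts, 1, idx.unsqueeze(-1).expand(-1, -1, 3))
grouped.sum().backward()

# Only the 8 selected points receive any gradient; *which* points were
# selected contributed nothing, so the loss is piecewise w.r.t. the input.
print((pts.grad.abs().sum(dim=-1) > 0).sum())          # tensor(8)
```

The gather passes gradients to the coordinates of whichever points happen to be selected, but the decision of which points fall inside the ball carries no gradient, so a PGD attacker gets no signal about the neighborhood structure itself.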

Our new paper "PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition" also shows that AT through kNN layers suffers from similar limitations.

Please let me know if you have any further questions.

xingbw commented 2 years ago

Hi, thanks for your reply! It has resolved my confusion about gradient backpropagation in PointNet++.

However, after reading PointDP, another question arises for me. In section 4.3 of PointDP, it says gradient backpropagation through kNN layers is indexing, which is non-smooth. Does that mean the gradient can only backpropagate to the index, instead of to the features (and the input)? I'm not sure if my understanding is correct.

And from the results in Table 3, for DGCNN and PCT, does the gradient obfuscation problem only occur under black-box attacks? If not, why is the robust accuracy under white-box attacks not affected that much?

Sorry to bother you again. Looking forward to your reply.

jiachens commented 2 years ago

That is a very good question. White-box attacks also need the gradient to craft perturbations, but because of the indexing, that gradient is not the "real gradient" of the model. Black-box attacks that approximate the gradient by querying the model can therefore end up even closer to the model's true gradient.
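To illustrate the contrast, here is a toy zeroth-order gradient estimator of the kind black-box attacks (e.g., SPSA/NES-style) use in place of autograd; the function name and constants are illustrative, not from either paper:

```python
import torch

def zeroth_order_grad(f, x, eps=1e-2, n_samples=128):
    """Estimate grad f(x) by random central finite differences,
    as SPSA/NES-style black-box attacks do instead of autograd."""
    g = torch.zeros_like(x)
    for _ in range(n_samples):
        u = torch.randn_like(x)
        g += (f(x + eps * u) - f(x - eps * u)) / (2 * eps) * u
    return g / n_samples

# Autograd through hard indexing differentiates only the currently
# selected branch; the finite-difference estimate instead averages the
# loss change over nearby inputs, smoothing across index switches.
```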

jiachens commented 2 years ago

If you do not have further questions, I will close this issue.

xingbw commented 2 years ago

Quite sorry for the delayed reply. Thanks very much! However, regarding adversarial training: the PCT model also has sampling and grouping operations like PointNet++, yet I notice the paper implements the PGD method for PCT and improves its performance from 25.5 to 18.4. Does this mean PGD-based adversarial training may actually also be helpful for PointNet++?
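For reference, a minimal sketch of the PGD inner loop under discussion, an L-inf attack on raw point coordinates used to generate training examples (hyperparameters and names are illustrative, not the paper's implementation):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, pts, labels, eps=0.05, alpha=0.01, steps=7):
    """L-inf PGD on raw point coordinates (illustrative sketch)."""
    pts = pts.detach()
    adv = pts + torch.empty_like(pts).uniform_(-eps, eps)
    for _ in range(steps):
        adv = adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad, = torch.autograd.grad(loss, adv)
        # If ball query / kNN indexing inside `model` blocks or distorts
        # this gradient, the attack (and hence AT) loses its signal.
        adv = adv + alpha * grad.sign()
        adv = pts + (adv - pts).clamp(-eps, eps)   # project back to the ball
    return adv.detach()

# Adversarial training then trains on these examples:
# for pts, labels in loader:
#     adv = pgd_attack(model, pts, labels)
#     optimizer.zero_grad()
#     F.cross_entropy(model(adv), labels).backward()
#     optimizer.step()
```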