tensorflow / neural-structured-learning

Training neural models with structured signals.
https://www.tensorflow.org/neural_structured_learning
Apache License 2.0
980 stars 189 forks

Clarification on attacks used in Adversarial Training #127

Closed madarax64 closed 1 year ago

madarax64 commented 1 year ago

Hello, I was trying to use NSL to implement adversarial training on my custom model, so I followed the default steps in the tutorial video, which worked like a charm. While studying the code, I noticed that the call to make_adv_reg_config() has a parameter called pgd_epsilon, which is "...Only used in Projected Gradient Descent (PGD) attack".

This statement suggests that NSL can use different attacks in adversarial training; however, it is not clear how to select which attack to use, or which attack is currently in use. Until now I had assumed that PGD was being used by default, as this is common in the literature, but I would like to know whether this is actually the case and, by extension, whether it is possible to use a different attack and how to do so.

Thanks!

csferng commented 1 year ago

Thanks for the question, @madarax64!

NSL supports both the PGD attack and the Fast Gradient Sign Method (FGSM) attack. FGSM is a special case of PGD: the PGD procedure reduces to an FGSM attack when pgd_iteration=1, random_init=False, and adv_grad_norm='infinity'.

The parameter pgd_epsilon is only needed when pgd_iteration > 1, where it controls the per-iteration step size. The parameter adv_step_size controls the overall step size, i.e. how far an adversarial example can deviate from the original example.
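To make the relationship above concrete, here is a minimal NumPy sketch (not NSL code; the function name `pgd_attack` and the quadratic toy loss are illustrative assumptions) showing that one PGD iteration with no random initialization and a sign (L-infinity) gradient step is exactly an FGSM step:

```python
# Hypothetical sketch, not taken from NSL: FGSM as a one-step special
# case of PGD. The toy loss is L(x) = 0.5 * ||x - target||^2, whose
# gradient is (x - target); the attack ascends this loss under an
# L-infinity constraint, mirroring adv_grad_norm='infinity'.
import numpy as np

def pgd_attack(x, grad_fn, step_size, epsilon, iterations,
               random_init=False, rng=None):
    """Projected gradient ascent inside the L-infinity ball of radius epsilon."""
    x0 = x.copy()
    adv = x.copy()
    if random_init:
        rng = rng or np.random.default_rng(0)
        adv = adv + rng.uniform(-epsilon, epsilon, size=x.shape)
    for _ in range(iterations):
        adv = adv + step_size * np.sign(grad_fn(adv))   # sign-gradient step
        adv = x0 + np.clip(adv - x0, -epsilon, epsilon)  # project back into the ball
    return adv

target = np.array([1.0, -1.0])
grad_fn = lambda x: x - target  # gradient of the toy quadratic loss

x = np.zeros(2)
# One iteration, no random init: identical to an FGSM step of the same size.
fgsm = x + 0.1 * np.sign(grad_fn(x))
one_step_pgd = pgd_attack(x, grad_fn, step_size=0.1, epsilon=0.1, iterations=1)
assert np.allclose(fgsm, one_step_pgd)
```

With more iterations, the projection step keeps the perturbation within the epsilon ball, which is what distinguishes PGD from repeatedly applying FGSM.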

madarax64 commented 1 year ago

Hi @csferng, thanks, got it now. I've enabled random initialization and set the pgd_iteration, pgd_epsilon, and adv_step_size arguments as needed. I appreciate the clarification!

madarax64 commented 1 year ago

Hi @csferng, following on from the above: I was trying to use PGD adversarial training, so I set the arguments as mentioned above and also set random_init to True (to mimic PGD's random initialization). However, this causes the following error (edited for clarity):

ValueError: Exception encountered when calling layer "AdversarialRegularization" (type AdversarialRegularization).

    in user code:

        File "/home/madarax64/.conda/envs/tf28/lib/python3.10/site-packages/neural_structured_learning/keras/adversarial_regularization.py", line 682, in call  *
            adv_loss = adversarial_loss(
        File "/home/madarax64/.conda/envs/tf28/lib/python3.10/site-packages/neural_structured_learning/keras/adversarial_regularization.py", line 138, in adversarial_loss  *
            adv_input, adv_sample_weights = adversarial_neighbor.gen_adv_neighbor(
        File "/home/madarax64/.conda/envs/tf28/lib/python3.10/site-packages/neural_structured_learning/lib/adversarial_neighbor.py", line 415, in gen_adv_neighbor  *
            return adv_helper.gen_neighbor(input_features, pgd_labels)
        File "/home/madarax64/.conda/envs/tf28/lib/python3.10/site-packages/neural_structured_learning/lib/adversarial_neighbor.py", line 306, in gen_neighbor  *
            perturbations = utils.random_in_norm_ball(
        File "/home/madarax64/.conda/envs/tf28/lib/python3.10/site-packages/neural_structured_learning/lib/utils.py", line 179, in random_in_norm_ball  *
            tensor_structure)

        ValueError: Cannot convert a partially known TensorShape (None, 17) to a Tensor.

    Call arguments received:
      • inputs={'feature': 'tf.Tensor(shape=(None, 17), dtype=float32)', 'label': 'tf.Tensor(shape=(None, 46), dtype=float32)'}
      • kwargs={'training': 'True'}

To confirm: setting random_init to False does not cause this error. Could you kindly let me know how to fix this?

csferng commented 1 year ago

Thanks for reporting the issue, @madarax64.

The error occurs because nsl.lib.utils.random_in_norm_ball expects the tensor shape to be fully known. This can be relaxed by changing t.shape to tf.shape(t) in random_in_norm_ball. Alternatively, you may set run_eagerly=True in model.compile, which runs the model in eager mode so that all tensor shapes are known at call time. (Eager mode will slow down model training quite a bit, though.)
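For reference, the run_eagerly workaround is just a flag on the compile call. A minimal sketch, assuming a model already wrapped in nsl.keras.AdversarialRegularization (the variable name adv_model and the optimizer/loss choices here are illustrative, not from this thread):

```python
# Illustrative fragment, not verified against a specific NSL version.
# run_eagerly=True forces eager execution, so tensor shapes are concrete
# when random_in_norm_ball runs -- at the cost of slower training.
adv_model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    run_eagerly=True,
)
```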

madarax64 commented 1 year ago

Hi @csferng, I can confirm that this fixed the issue for me. I can see that there's a commit referencing this issue. I take it this is now supported directly in the codebase?

csferng commented 1 year ago

Hi @madarax64, sorry for the late reply. Yes, this is supported in the codebase and will be included in the next release. Thank you!

madarax64 commented 1 year ago

No worries @csferng , thanks for the confirmation and the excellent work!