Harry24k / adversarial-attacks-pytorch

PyTorch implementation of adversarial attacks [torchattacks].
https://adversarial-attacks-pytorch.readthedocs.io/en/latest/index.html
MIT License

TorchAttacks for quantised PyTorch models #150

Open mohanrajroboticist opened 1 year ago

mohanrajroboticist commented 1 year ago

✨ Short description of the feature [tl;dr]

torchattacks does not work for PyTorch quantised models.

πŸ’¬ Detailed motivation and codes

When using torchattacks on a PyTorch quantised model, the following error occurs.

[screenshot: RuntimeError traceback, reproduced in full in the reply below]

In the attached Colab notebook, the quantise parameter in the dictionary can be set to True or False to run the attack with or without quantisation: https://colab.research.google.com/drive/1-_e8y0OoCYEkChKJOurETH4dHHKpRTYU?usp=sharing
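
A minimal sketch that reproduces this class of failure (not the notebook itself; the model, input shapes, and the choice of dynamic quantisation below are assumptions, and depending on the torchattacks version the failure may surface at attack construction or at the gradient computation):

```python
import torch
import torch.nn as nn
import torchattacks

# Small float model; the conv layer keeps at least one float parameter so
# that torchattacks can still infer the model's device.
model = nn.Sequential(
    nn.Conv2d(1, 4, 3),
    nn.Flatten(),
    nn.Linear(4 * 26 * 26, 10),
)

# Dynamic quantisation replaces the Linear with a qint8 version whose ops
# do not record an autograd graph.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

img = torch.rand(1, 1, 28, 28)  # dummy input in [0, 1]
lab = torch.tensor([3])         # dummy label

atk = torchattacks.FGSM(qmodel, eps=8 / 255)
adv_images = atk(img, lab)  # RuntimeError: element 0 of tensors does not
                            # require grad and does not have a grad_fn
```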

rikonaka commented 1 year ago

Hi @mohanrajroboticist, I am not an expert in quantized models, but while running and debugging the code I noticed that your model does not seem to produce gradients. I'm not sure whether this is a property of the model or something else? 😜

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[2], line 10
      8 img = img.to(device)
      9 lab = lab.to(device)
---> 10 adv_images = atk(img, lab)

File ~/adversarial-attacks-pytorch/torchattacks/attack.py:465, in Attack.__call__(self, images, labels, *args, **kwargs)
    463 self._change_model_mode(given_training)
    464 images = self._check_inputs(images)
--> 465 adv_images = self.forward(images, labels, *args, **kwargs)
    466 adv_images = self._check_outputs(adv_images)
    467 self._recover_model_mode(given_training)

File ~/adversarial-attacks-pytorch/torchattacks/attacks/fgsm.py:56, in FGSM.forward(self, images, labels)
     53     cost = loss(outputs, labels)
     55 # Update adversarial images
---> 56 grad = torch.autograd.grad(cost, images,
     57                            retain_graph=False, create_graph=False)[0]
     59 adv_images = images + self.eps*grad.sign()
     60 adv_images = torch.clamp(adv_images, min=0, max=1).detach()

File ~/.local/lib/python3.9/site-packages/torch/autograd/__init__.py:303, in grad(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused, is_grads_batched)
    301     return _vmap_internals._vmap(vjp, 0, 0, allow_none_pass_through=True)(grad_outputs_)
    302 else:
--> 303     return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    304         t_outputs, grad_outputs_, retain_graph, create_graph, t_inputs,
    305         allow_unused, accumulate_grad=False)

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

mohanrajroboticist commented 1 year ago

Hello @rikonaka, thank you very much for your comment. According to the following information from the PyTorch forums, quantised tensors do not have gradients: https://discuss.pytorch.org/t/how-can-i-get-the-gradient-of-quantized-tensor/154280
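
A quick illustrative check of this (not from the notebook): quantised dtypes are not floating point, so autograd refuses to track them.

```python
import torch

x = torch.rand(2, 2)
x_q = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)

print(x_q.dtype)          # torch.qint8
x_q.requires_grad_(True)  # RuntimeError: only Tensors of floating point
                          # dtype can require gradients
```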

I would like to ask: are there any non-gradient-based adversarial attacks that I could use on quantised models?

rikonaka commented 1 year ago

Hi @mohanrajroboticist, there are many attack methods that are not based on the gradient; for example, the CW attack and the EAD attack, among many others, use optimization algorithms that are not gradient-based.
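
As one illustration (my choice for the sketch, not necessarily CW or EAD), torchattacks also ships query-based attacks such as Square, which need only forward passes; here `qmodel`, `img`, and `lab` are assumed placeholders for the quantised model and data, and the parameters are illustrative:

```python
import torchattacks

# Query-based attack: perturbs the input and scores it with forward
# passes only, so no autograd graph is required.
atk = torchattacks.Square(qmodel, norm='Linf', eps=8 / 255, n_queries=500)
adv_images = atk(img, lab)
```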

There are still some problems in the torchattacks code, so your code will not run directly at the moment. torchattacks needs to infer the device the model lives on (such as cuda or cpu), but a quantized model does not expose this, so the attack cannot run.
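
A sketch of the device-inference pattern and why it can break (illustrative; `infer_device` is a stand-in, not the actual torchattacks code):

```python
import torch.nn as nn

def infer_device(model: nn.Module):
    # Typical pattern: take the device of the model's first parameter.
    return next(model.parameters()).device

# Fully quantised modules keep their weights in packed params rather than
# nn.Parameter objects, so parameters() can be an empty iterator and
# next() raises StopIteration here.
```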

I also noticed that quantized models seem to run only on the CPU (just a question out of curiosity)? 😀

I have submitted a pull request, so soon you will be able to attack quantized models with torchattacks (installing from source is recommended, because the pip version is outdated). Here is my code-debugging process, which you can use as a reference for future attacks on quantised models. ☺️

https://colab.research.google.com/drive/11jqjxsxdQ07wmZb-UZAIJI4C8FlNsHVE?usp=sharing

mohanrajroboticist commented 1 year ago

Hello @rikonaka, thanks. Quantised tensors in PyTorch are not directly supported on CUDA; support is currently available only for CPUs. For more information, please check the quantised int8 data types in the table at https://pytorch.org/docs/stable/tensors.html#data-types

Quantised tensors do have device information; I have added a sample below. [screenshot: quantised tensor printout showing its device]
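
The screenshot itself is not preserved here; a sketch of the kind of check it likely showed:

```python
import torch

# Quantised tensors do expose a .device attribute:
x_q = torch.quantize_per_tensor(
    torch.rand(2, 2), scale=0.1, zero_point=0, dtype=torch.qint8
)
print(x_q.device)  # cpu (quantised tensors currently live on the CPU)
```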

rikonaka commented 1 year ago

Wow, @mohanrajroboticist, it seems I have learned something new today. Although the device information is available, it is still recommended that users set the device themselves. BTW, if you don't want to wait, you can clone and install my branch; merging into the master branch may take some time (I can't be sure how long: it could be a few months, or the merge could happen tomorrow). Or, if you don't mind, you can wait a while for the merge to complete.

mohanrajroboticist commented 1 year ago

Hello @rikonaka, I tested your branch, and there are issues with the data type of the attacked images. I have attached the Colab notebook of my test below. Also, could you please grant access to the Colab reference notebook you shared?

https://colab.research.google.com/drive/1-_e8y0OoCYEkChKJOurETH4dHHKpRTYU?usp=sharing

rikonaka commented 1 year ago

> Hello @rikonaka, I tested your branch, and there are issues with the data type of the attacked images. I have attached the Colab notebook of my test below. Also, could you please grant access to the Colab reference notebook you shared?
>
> https://colab.research.google.com/drive/1-_e8y0OoCYEkChKJOurETH4dHHKpRTYU?usp=sharing

I'm very sorry, I forgot to enable sharing; access is open now, so you can check it. 😥 It also seems that your code works, so where is the problem?

[screenshot: notebook output showing accuracy numbers]

If you mean the first two accuracy numbers in the screenshot, that is a problem with the model, not with torchattacks; maybe you should ask on the PyTorch forums. I am not an expert in quantized models and might not be able to help you with that. 🤯