Closed Tenpi closed 5 months ago
Hi @Tenpi , first question: why do you need to use adversarial attacks? Second question: how do you achieve this? Third question: have you studied adversarial attacks before?
I want to use it so that the BLIP model captions the image differently. I already posted the code I used to produce the image. I'm not sure how well this library works on other models; as I explained before, I had to modify the forward function to ignore the caption parameter and expand the dims of the returned tensor. I know that adversarial attacks can confuse a model using noise that is barely visible.
The full code of BLIP (model I'm using) is here: https://github.com/salesforce/BLIP
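The modification described above might look roughly like the sketch below. This is a hypothetical stand-in, not the real BLIP code: `CaptionModelWrapper` and `DummyCaptioner` are invented names, and the exact BLIP forward signature depends on the repo version. The point is only to show a wrapper that drops the caption argument and expands the output dims so an attack library sees a classifier-shaped `(batch, classes)` tensor.

```python
import torch
import torch.nn as nn

class CaptionModelWrapper(nn.Module):
    """Hypothetical wrapper that makes a captioning-style model look like a
    classifier to an attack library: it calls the inner model without a
    caption and expands the returned tensor's dims to (batch, classes)."""
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, image):
        # Call the underlying model while ignoring the caption parameter
        out = self.model(image, caption=None)
        if out.dim() == 1:  # expand dims if the model returns a 1-D tensor
            out = out.unsqueeze(0)
        return out

# Stand-in "captioning" model whose forward() takes a caption parameter
class DummyCaptioner(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(3 * 8 * 8, 10)

    def forward(self, image, caption=None):
        # Returns a 1-D tensor, mimicking a model whose output needs expanding
        return self.head(image.flatten(1)).squeeze(0)

wrapped = CaptionModelWrapper(DummyCaptioner())
logits = wrapped(torch.randn(1, 3, 8, 8))
print(logits.shape)  # a (1, 10) classifier-shaped output
```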
The adversarial attack requires that the model being attacked is a classifier (input: images, output: labels) rather than a transformer. FGSM and PGD both compute adversarial examples from classifier gradients; others, like the CW attack, have their own optimisation methods, but the essence is still to move the original benign example across the decision boundary with as small a perturbation as possible. I think you should read a few papers on basic adversarial attacks, rather than just knowing what adversarial attacks are for, before starting your research.
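For reference, the gradient-based step being described is only a few lines in its basic FGSM form. A minimal sketch on a toy classifier (the model here is a stand-in, not BLIP): the attack needs classifier logits to compute a loss, then perturbs the input in the direction of the sign of the loss gradient.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, image, label, eps):
    """One-step FGSM: move the input by eps in the direction of the sign
    of the gradient of the classification loss w.r.t. the input."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + eps * image.grad.sign()
    return adv.clamp(0, 1).detach()  # keep the result a valid image

torch.manual_seed(0)
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))  # toy classifier
x = torch.rand(1, 3, 8, 8)
y = torch.tensor([3])
adv = fgsm(net, x, y, eps=8 / 255)
print((adv - x).abs().max())  # perturbation bounded by eps (clamping may shrink it)
```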
And, based on the information you have provided, torchattacks does not have any bugs.
It doesn't do anything and doesn't give any errors either; I would classify that as a bug. However, I will look into implementing the attacks on my own instead of using torchattacks, since it seems like it wasn't optimized for models such as BLIP. I should also mention that BLIP doesn't generate the caption by running the model as `model()`; it has its own separate method, `model.generate()`, which is the one used to get the caption.
Hello
I have the same situation: the generated attacked image is the same as the input one. I tested FGSM and the boundary attack (FAB).
Hi @preste-nakam , can you provide your test code?
✨ Short description of the bug [tl;dr]
Going back to what I mentioned in #154, I am just getting the same image as input for the vast majority of attacks. I didn't try them all, but the only one that seems to work on the BLIP model is PGD.
It doesn't give any errors, it just gives me the same image at the highest epsilon (1) for FGSM attack.
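A quick way to confirm an attack is silently returning the input unchanged is to compare the tensors directly. This is a small diagnostic sketch with dummy tensors standing in for the attack's input and output; `attack_changed_input` is a hypothetical helper, not part of torchattacks.

```python
import torch

def attack_changed_input(images, adv_images, tol=0.0):
    """Return (changed, max_diff): whether the attack actually perturbed
    the input tensor, and the largest absolute pixel difference."""
    diff = (adv_images - images).abs().max().item()
    return diff > tol, diff

# Dummy tensors standing in for the attack's input/output
x = torch.rand(1, 3, 8, 8)
changed, diff = attack_changed_input(x, x.clone())
print(changed, diff)  # False 0.0 when the "attack" returned the input unchanged
```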
On PGD it does give me a noisy image, so at least it works. But it needs to be quite strong to really affect the BLIP model.
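For comparison, the PGD loop itself is just a short iteration of FGSM-style steps projected back into an eps-ball around the original image. This sketch uses a stand-in classifier, not BLIP, and is a manual reimplementation rather than the torchattacks code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd(model, image, label, eps, alpha, steps):
    """Iterative PGD: repeated signed-gradient steps of size alpha, each
    projected back into the L-inf ball of radius eps around the original."""
    orig = image.clone().detach()
    adv = orig.clone()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), label)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        adv = orig + (adv - orig).clamp(-eps, eps)  # project into the eps-ball
        adv = adv.clamp(0, 1)                       # keep a valid image
    return adv.detach()

torch.manual_seed(0)
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))  # toy classifier
x = torch.rand(1, 3, 8, 8)
y = torch.tensor([3])
adv = pgd(net, x, y, eps=8 / 255, alpha=2 / 255, steps=10)
print((adv - x).abs().max() <= 8 / 255 + 1e-6)  # True: noise stays within eps
```

Because the steps accumulate across iterations, PGD usually produces visible noise even at moderate eps, which matches the behaviour described above.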
DeepFool also just gives me the same image with no errors. On AutoAttack and Jitter, I do get this error:
💬 Detailed code and results