Harry24k / adversarial-attacks-pytorch

PyTorch implementation of adversarial attacks [torchattacks].
https://adversarial-attacks-pytorch.readthedocs.io/en/latest/index.html
MIT License

[BUG] Getting same image as input #155

Closed: Tenpi closed this issue 5 months ago

Tenpi commented 1 year ago

✨ Short description of the bug [tl;dr]

Going back to what I mentioned in #154, I am just getting the same image as the input for the vast majority of attacks. I didn't try them all, but the only one that seems to work on the BLIP model is PGD.

It doesn't give any errors, it just gives me the same image even at the highest epsilon (1) for the FGSM attack.

On PGD it does give me a noisy image, so at least it works, but the attack needs to be quite strong to really affect the BLIP model.

DeepFool also just gives me the same image with no errors. On AutoAttack and Jitter, I do get this error:

acc = self.get_logits(x).max(1)[1] == y
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
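
(For reference, this traceback means the output returned by get_logits(x) is not a 2-D [batch, num_classes] logits tensor; a minimal sketch of the shapes involved, not BLIP-specific:)

import torch

logits = torch.randn(1, 10)   # [batch, num_classes]: the shape .max(1)[1] expects
print(logits.max(1)[1])       # predicted class indices -- works

vec = torch.randn(10)         # 1-D output, e.g. a feature or token vector
vec.max(1)                    # raises the IndexError quoted above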

💬 Detailed code and results

def load_blip_image(image, dim):
    global width 
    global height
    raw_image = Image.open(image).convert("RGB")
    width, height = raw_image.size
    transform = transforms.Compose([
        transforms.Resize((dim, dim), interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.ToTensor()
    ])
    return transform(raw_image).unsqueeze(0).to(device)

def blip(input, output, attack = "pgd", epsilon = 10/255):
    global model
    model = blip_module.blip_decoder(pretrained=os.path.join(dirname, "models/blip/blip.pt"), image_size=384, vit="base")
    model.eval()
    model.to(device)
    img = load_blip_image(input, 384)
    atk = torchattacks.PGD(model, eps=epsilon)
    if attack == "fgsm":
        atk = torchattacks.FGSM(model, eps=epsilon) #doesn't work
    if attack == "deepfool":
        atk = torchattacks.DeepFool(model) #doesn't work
    if attack == "autoattack":
        atk = torchattacks.AutoAttack(model) #doesn't work
    if attack == "jitter":
        atk = torchattacks.Jitter(model) #doesn't work
    adv_image = atk(img, torch.tensor([1.0]))
    save_image(adv_image[0], output)
    img2 = Image.open(output)
    img2 = resize(img2, (width, height))
    img2.save(output)
    return img2
rikonaka commented 1 year ago

Hi @Tenpi, first question: why do you need to use adversarial attacks? Second question: how are you trying to achieve this? Third question: have you studied how adversarial attacks work?

Tenpi commented 1 year ago

I want to use it so that the BLIP model captions the image differently. I already posted the code I used to produce the image. I'm not sure how well this library can be used with different models; as I explained before, I had to modify the forward function to ignore the caption parameter and expand the dims of the returned tensor. I know that adversarial attacks can be used to confuse a model using noise that is barely visible.

The full code of BLIP (model I'm using) is here: https://github.com/salesforce/BLIP

rikonaka commented 1 year ago

> I want to use it so that the BLIP model captions the image differently. I already posted the code I used to produce the image. I'm not sure how well this library can be used with different models; as I explained before, I had to modify the forward function to ignore the caption parameter and expand the dims of the returned tensor. I know that adversarial attacks can be used to confuse a model using noise that is barely visible.
>
> The full code of BLIP (model I'm using) is here: https://github.com/salesforce/BLIP

Adversarial attacks require that the model being attacked is a classifier (input images, output labels) rather than a transformer. FGSM and PGD both compute adversarial examples from the classifier's gradients; others, like the CW attack, have their own optimisation methods, but the essence is still to push the original benign example across the decision boundary with as small a perturbation as possible. I think you should read a few papers on basic adversarial attacks, rather than just knowing what adversarial attacks are for, before starting your research.
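
As a rough illustration of the interface torchattacks assumes (the tiny classifier below is just a stand-in, not something from this thread): the model maps images to a [batch, num_classes] logits tensor, and the labels passed to the attack are integer class indices.

import torch
import torch.nn as nn
import torchattacks

# toy stand-in classifier: images in, [batch, num_classes] logits out
class TinyClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
    def forward(self, x):
        return self.net(x)                        # shape [batch, num_classes]

model = TinyClassifier().eval()
images = torch.rand(2, 3, 32, 32)                 # pixel values in [0, 1]
labels = torch.tensor([3, 7])                     # integer class indices, not floats
adv = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)(images, labels)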

And, based on the information you have provided, torchattacks does not have any bugs.

Tenpi commented 1 year ago

> And, based on the information you have provided, torchattacks does not have any bugs.

It doesn't do anything and doesn't give any errors either; I would classify that as a bug. However, I will look into implementing the attacks on my own instead of using torchattacks, since it seems like this wasn't optimized for models such as BLIP. I should also mention that BLIP doesn't generate the caption by running it as model(); it has its own separate method called model.generate(), which is the one used to get the caption.
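
For what it's worth, a hand-rolled PGD only needs a scalar, differentiable loss computed from the input image; a rough sketch of what I mean (loss_fn here is a placeholder, e.g. some distance between BLIP image features, not working BLIP code):

import torch

def pgd_on_loss(image, loss_fn, eps=8/255, alpha=2/255, steps=10):
    # image: [1, 3, H, W] tensor in [0, 1]; loss_fn: returns a scalar to maximize
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = loss_fn(adv)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()            # step up the loss
        adv = image + torch.clamp(adv - image, -eps, eps)   # project into the eps-ball
        adv = torch.clamp(adv, 0, 1).detach()               # keep a valid image
    return adv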

preste-nakam commented 10 months ago

Hello

I have the same situation. The generated attacked image is the same as the input one. I tested FGSM and the boundary attack (FAB).

rikonaka commented 10 months ago

> Hello
>
> I have the same situation. The generated attacked image is the same as the input one. I tested FGSM and the boundary attack (FAB).

Hi @preste-nakam, can you provide your test code?
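
For reference, a minimal self-contained check along these lines (a sketch; an untrained resnet18 is just a stand-in classifier, not taken from this thread) would show whether the attack is modifying the image at all:

import torch
import torchattacks
from torchvision.models import resnet18

model = resnet18(weights=None).eval()               # untrained stand-in classifier
images = torch.rand(4, 3, 224, 224)                 # dummy batch in [0, 1]
labels = model(images).argmax(1)                    # integer class indices
adv = torchattacks.FGSM(model, eps=8/255)(images, labels)
print("max |adv - input|:", (adv - images).abs().max().item())  # ~eps if the attack ran, 0.0 if not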