Hi @Adversarian, I have conducted several tests so far, but I do not currently have a normalized model, so I cannot run the related tests yet 🥲. My result is that CWBS (or CWBSL2) achieves a 100% attack success rate on 100 images.
```python
atk = CWBS(model, init_c=1, steps=10, lr=0.01, binary_search_steps=10, abort_early=False)
```

And if I set `abort_early=True`:

```python
atk = CWBS(model, init_c=1, steps=10, lr=0.01, binary_search_steps=10, abort_early=True)
```
So far, the performance looks normal. I will try the normalized model next 😨.
These were my hyperparameters, although I did change them around quite a bit:

```
"CW": {
    "kappa": 14,
    "init_c": 1,
    "binary_search_steps": 10,
    "steps": 100,
    "lr": 1e-2,
}
```
I suspect it might have something to do with normalization but I'm still afraid I might be doing something wrong here. I will try to create a minimal reproducible example when I can.
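For reference, the kind of normalization setup I mean here is the one where the normalization is folded into the model itself rather than the data transform, e.g. a wrapper along these lines (a rough sketch; the class name and structure are just illustrative):

```python
import torch
import torch.nn as nn

class NormalizedModel(nn.Module):
    """Illustrative wrapper: normalization happens inside forward()."""

    def __init__(self, model, mean, std):
        super().__init__()
        self.model = model
        # Buffers so the statistics follow .to(device) / .cuda()
        self.register_buffer('mean', torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer('std', torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        # x is expected to be in the (0, 1) range
        return self.model((x - self.mean) / self.std)
```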
I just trained a ResNet18 model, and the attack success rate of CWBS is still 100% 😨.
```python
import torch
from tqdm import tqdm

model = load_model()
model.eval()

total = 0
success = 0

# atk = CWL0(model, c=1, steps=10, lr=0.01, abort_early=True)
# atk = CW(model, c=1, steps=10, lr=0.01, abort_early=True)
# atk = CWLinf(model, c=1, steps=10, lr=0.01, abort_early=True)
# atk = CWBS(model, init_c=1, steps=10, lr=0.01, binary_search_steps=10, abort_early=False)
atk = CWBS(model, init_c=1, steps=10, lr=0.01, binary_search_steps=10, abort_early=True)  # nopep8
atk.set_normalization_used(mean=(0.4914, 0.4822, 0.4465), std=(0.247, 0.243, 0.261))  # nopep8

with tqdm(total=len(test_loader), desc='Test') as tbar:
    for batch_idx, (x, y) in enumerate(test_loader):
        if total > 1000:
            break
        x, y = x.to(device), y.to(device)
        total += y.shape[0]
        adv_images = atk(x, y)
        adv_pred = model(adv_images)
        # print(y)
        # print(torch.argmax(adv_pred, 1))
        # Count samples whose prediction no longer matches the true label
        success += torch.sum(y != torch.argmax(adv_pred, 1)).item()
        tbar.update()

success_rate = success / total
print("Attack success rate: {:.3f}".format(success_rate))
```
Maybe there are some problems in your code.
Thank you for your time! I'm going to try to create a working example and send it here. Until then, I will close the issue, as your testing is sufficient evidence that something must be going wrong in my code.
Sorry for reopening this issue so early, but one thing in your snippet jumped out at me. Can you please show me how you've constructed your `test_loader`? I'm asking because I see that you're passing `adv_images` to the victim model without normalization.
I've been normalizing the outputs of the attacks with the precalculated `mean` and `std` before passing them on to the victim model. I think this is correct because the robust accuracy improves significantly after doing this, which leads me to believe that the outputs of the attacks are not normalized. Am I making a mistake here?
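For concreteness, here is roughly what I've been doing (a simplified sketch; `atk`, `model`, `x`, and `y` are as in a loop like yours above, and the statistics are the precalculated CIFAR-10 ones):

```python
from torchvision import transforms

# Precalculated CIFAR-10 statistics
normalize = transforms.Normalize(mean=(0.4914, 0.4822, 0.4465),
                                 std=(0.247, 0.243, 0.261))

adv_images = atk(x, y)                   # attack output
adv_pred = model(normalize(adv_images))  # normalized again before the victim
```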
Never mind, I think I see where I've been going wrong. I had assumed that `set_normalization_used` meant I would have to pass unnormalized images to the attacker, but after reviewing the code I understand that I actually have to feed already-normalized images to the attacker when using it, and the attacker will return normalized outputs which I can pass straight into the model.
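In other words, the flow should look something like this sketch (assuming the loader's transform already includes `Normalize`):

```python
# x is already normalized because Normalize is part of the dataset transform
adv_images = atk(x, y)        # the attack inverse-normalizes internally,
                              # perturbs in (0, 1), then re-normalizes
adv_pred = model(adv_images)  # passed straight through, no extra Normalize
```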
I will make these changes to my code and hopefully this time the issue will stay closed.
I'm here to report that my issue was successfully resolved. I'm leaving this comment here in case anyone else finds themselves in a similar situation. If you've migrated to Torchattacks from Foolbox, the two libraries behave differently with regard to normalization, which is why I was struggling with this in the first place.
When you use the `set_normalization_used` method of an attack in TA, the attacker expects an already normalized input. Here's what happens when you pass a set of images to the attacker after `set_normalization_used` has been called:

1. Adversarial images (`adv_images`) are obtained.
2. `adv_images` is clamped to the (0, 1) range.
3. `adv_images` is normalized with the `mean` and `std` you set when calling `set_normalization_used`.

So essentially, when `set_normalization_used` is called, the attacker expects an already normalized input whose inverse normalization lies in the (0, 1) range, and it returns a batch of normalized adversarial images which you won't have to normalize again before feeding them to your victim model.
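Put into code, the working pipeline looks roughly like this (a sketch; `model` is the victim network, `CWBS` is the attack from the PR above, and the root path and batch size are illustrative):

```python
import torch
from torchvision import datasets, transforms

mean = (0.4914, 0.4822, 0.4465)
std = (0.247, 0.243, 0.261)

# Normalize is part of the transform, so the loader yields normalized images
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize(mean, std)])
test_set = datasets.CIFAR10(root='./data', train=False, download=True,
                            transform=transform)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128)

atk = CWBS(model, init_c=1, steps=10, lr=0.01, binary_search_steps=10)
atk.set_normalization_used(mean=mean, std=std)

for x, y in test_loader:
    adv_images = atk(x, y)        # normalized in, normalized out
    adv_pred = model(adv_images)  # no second normalization needed
```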
❔ Any questions
I didn't file this under [BUG] because I'm certain there's a problem with my use of the library rather than the library itself. I am trying to create adversarial samples using the CW attack method, but I am not having great success with it, unfortunately.
I am using the `CWBSL2` method from @rikonaka's PR, which I think has the same underlying `CW` implementation as the base library, only with an added binary search component.

Things I've tried:

- The base `torchattacks.CW` as well.

I have a simple ResNet18 victim model on CIFAR10 which uses normalization. I have made sure to use the `set_normalization_used` method of the attack, and my input images are sampled directly from CIFAR10 with only a `ToTensor()` transform applied on top of them to bring them into the (0, 1) range.

Since my code is part of a private repository I cannot share my own code, but I might be able to cook up a snippet to reproduce this if need be. Roughly, though, the data pipeline looks like the sketch below.
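A simplified sketch of that pipeline (the root path and batch size are illustrative):

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Only ToTensor() is applied, so images stay in the (0, 1) range
test_set = datasets.CIFAR10(root='./data', train=False, download=True,
                            transform=transforms.ToTensor())
test_loader = DataLoader(test_set, batch_size=128, shuffle=False)
```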
The robust classification accuracy is around 89%, while I'm getting 0.25 and 0.0 for FGSM and PGD respectively, using the exact same piece of code.