WindVChen / DiffAttack

An unrestricted attack based on diffusion models that can achieve both good transferability and imperceptibility.
Apache License 2.0

inability to achieve results #8

Open tanlingp opened 1 year ago

tanlingp commented 1 year ago

My reproduction with the inception model could not match your results. That model usually takes 299×299 input, but here it is 224×224. Does this have any effect? Looking forward to your reply.

WindVChen commented 1 year ago

Hi @tanlingp,

Could you provide more details regarding what you meant by "The inception model I reproduced couldn't do what you did"? Typically, Inception models expect 299x299 input resolution. However, due to constraints of the Stable Diffusion model (it cannot handle an odd resolution such as 299), we conducted our experiments at a resolution of 224x224.

Concerning the impact of this resolution change, it may introduce minor fluctuations in results but should not significantly affect the overall experiment conclusions.
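Just to make the setting concrete, here is a minimal standalone sketch (not the exact evaluation code of this repo; the preprocessing and torchvision version are assumptions) showing that torchvision's pretrained Inception-v3 runs on 224x224 inputs in eval mode:

```python
import torch
from torchvision.models import inception_v3, Inception_V3_Weights

# Illustrative only (assumes a recent torchvision): thanks to its adaptive
# pooling, Inception-v3 in eval mode accepts 224x224 inputs even though its
# weights were trained at 299x299.
model = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1).eval()

with torch.no_grad():
    # In practice this would be a real image, resized to 224x224 and
    # normalized with the usual ImageNet mean/std.
    dummy = torch.rand(1, 3, 224, 224)
    logits = model(dummy)

print(logits.shape)  # torch.Size([1, 1000])
```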

If you require a 299x299 output, you might consider an alternative approach: first generate an image at a resolution compatible with Stable Diffusion (e.g., 304), then resize it to 299x299.
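As a rough sketch of that idea (purely illustrative; the tensor below just stands in for the actual pipeline output):

```python
import torch
import torch.nn.functional as F

# Placeholder for an adversarial image generated at a resolution that
# Stable Diffusion can handle (e.g., 304x304).
adv_304 = torch.rand(1, 3, 304, 304)

# Downsample to the 299x299 resolution that Inception models expect.
adv_299 = F.interpolate(adv_304, size=(299, 299),
                        mode="bilinear", align_corners=False)
print(adv_299.shape)  # torch.Size([1, 3, 299, 299])
```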

Hope this helps.

tanlingp commented 1 year ago

In my reproduction, this inception model is about 6% less attackable than reported in your paper when the attacks are crafted on resnet50 and vgg19.

WindVChen commented 1 year ago

Could you provide more details, like input resolution, specifics about the Inception model (e.g., whether it's the PyTorch default), and any other relevant hyperparameters?

tanlingp commented 1 year ago

My experimental setup follows the code you provided exactly. Normally the inception model input is 299×299, but here it is 224×224. What setup did you use?

WindVChen commented 1 year ago

This seems unusual. 🤔 We'll retest the code in this repository to check whether any bug was introduced during the code cleanup. Stay tuned for updates.

WindVChen commented 1 year ago

Hi @tanlingp,

I've re-run the code in this repository, and it appears to be functioning correctly. To speed things up, I divided the 1000 images into 8 parts and ran them in parallel on a server with eight RTX 4090 GPUs. The only modifications I made were to the parameters "images_root," "label_path," and "pretrained_diffusion_path" in main.py, pointing them to my local dataset and pretrained-weight paths (see the splitting sketch at the end of this comment). The results are as follows:

| Transfer target | Benign accuracy (%) | Adversarial accuracy (%) |
| --- | --- | --- |
| resnet | 92.7 | 60.1 |
| vgg | 88.7 | 56.9 |
| mobile | 86.9 | 55.6 |
| inception | 80.6 | 13.5 |
| convnext | 97.0 | 77.5 |
| vit | 93.7 | 73.9 |
| swin | 95.9 | 74.8 |
| deit-b | 94.5 | 77.3 |
| deit-s | 94.0 | 72.2 |
| mixer-b | 82.5 | 58.6 |
| mixer-l | 76.5 | 55.6 |
| tf2torch_adv_inception_v3 | 84.4 | 45.2 |
| tf2torch_ens3_adv_inc_v3 | 79.4 | 43.2 |
| tf2torch_ens4_adv_inc_v3 | 78.7 | 43.4 |
| tf2torch_ens_adv_inc_res_v2 | 90.3 | 58.4 |

FID: 62.14

Due to differences in the environment, the results may not perfectly align with those in the paper, but they are generally similar.
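And in case you want to parallelize your reproduction in the same way, here is the splitting sketch mentioned above (the paths and folder layout are placeholders, not something this repo requires; adapt them to your local setup):

```python
import shutil
from pathlib import Path

# Placeholder paths; adapt them to your local dataset layout.
src = Path("/path/to/images_root")
dst = Path("/path/to/images_root_splits")

files = sorted(p for p in src.glob("*") if p.is_file())  # the 1000 benign images
num_parts = 8                                            # one part per GPU

for i, f in enumerate(files):
    part_dir = dst / f"part_{i % num_parts}"
    part_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(f, part_dir / f.name)

# Each run of main.py then points "images_root" at one part_<k> folder and is
# launched on its own GPU (e.g., with CUDA_VISIBLE_DEVICES=<k>).
```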

tanlingp commented 1 year ago

Okay, thanks for the reply. It could be a matter of environmental differences.

WindVChen commented 1 year ago

> In my reproduction, this inception model is about 6% less attackable than reported in your paper when the attacks are crafted on resnet50 and vgg19.

🤔 I'm not entirely convinced that the environment difference alone would account for such a significant 6% variation in your reproduction (based on the above results, it seems to cause only minor changes). Please keep me posted if you uncover any other factors.

tanlingp commented 1 year ago

I will continue to look into the issue and will contact you if I find the cause.