xuanqing94 / BayesianDefense

Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network
MIT License

VGG with Variational Inference not training #2

Closed: kumar-shridhar closed this issue 5 years ago

kumar-shridhar commented 5 years ago

Hi, I am training a VGG VI network on CIFAR-10 and the validation accuracy remains very low (~20%) even after training for 200 epochs. The model was overfitting, with training accuracy reaching past 70%. I added L2 regularization (weight_decay) to the optimizer, but validation accuracy still did not improve. What am I doing wrong here? I used all the default parameters.

Thanks, Kumar
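For readers debugging similar training failures: variational-inference training typically balances the data loss against a KL term pulling the weight posterior toward the prior, which is where the `sigma_0` and `init_s` hyper-parameters discussed later in this thread enter. The repo's actual loss is not shown here; below is only a minimal NumPy sketch of the closed-form KL between a mean-field Gaussian posterior N(mu, s^2) and a prior N(0, sigma_0^2), under the assumption that both are diagonal Gaussians.

```python
import numpy as np

def kl_gaussian(mu, s, sigma_0):
    """Closed-form KL( N(mu, s^2) || N(0, sigma_0^2) ), summed over all weights."""
    return np.sum(np.log(sigma_0 / s) + (s**2 + mu**2) / (2 * sigma_0**2) - 0.5)

# Toy mean-field posterior over 4 weights (values are illustrative).
mu = np.array([0.1, -0.2, 0.0, 0.3])
s = np.full(4, 0.15)   # posterior std, cf. init_s in this thread
sigma_0 = 0.15         # prior std, cf. sigma_0 in this thread

kl = kl_gaussian(mu, s, sigma_0)
# When s == sigma_0, the log and variance terms cancel and the KL
# reduces to sum(mu^2) / (2 * sigma_0^2), i.e. an L2 penalty on the means.
```

This is why a too-small `sigma_0` can over-regularize the network and keep validation accuracy low: the KL term scales like 1/sigma_0^2.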

kumar-shridhar commented 5 years ago

Also, training with the adversarial VI code gives the same results, contrary to what is reported in the paper. [screenshot of results]

How can I replicate the results? Is there something that needs to be done beforehand?

xuanqing94 commented 5 years ago

Just want to do a quick check: can you download the checkpoints provided in the Google drive link?

kumar-shridhar commented 5 years ago

Ok. I will do it now. I trained it from scratch.

kumar-shridhar commented 5 years ago

Hi. From the checkpoint (cifar_vgg_vi), I got 91% on the CIFAR-10 test data. Why can't I train from scratch and reach the same accuracy? How was the checkpoint created? Were specific hyper-parameter settings used?

xuanqing94 commented 5 years ago

The hyper-parameters are all listed in the appendix. How do you run the script? Since I didn't list the bash command for cifar_vgg_vi in the README, it is possible that your parameters are not the same.

You might also try to run the script multiple times and pick a good one, although in my experience I don't find it necessary.

kumar-shridhar commented 5 years ago

I used the following hyper-parameters: `lr=0.01`, `sigma_0=0.15`, `init_s=0.15`

xuanqing94 commented 5 years ago

@kumar-shridhar Then we are using different settings; please check the parameters in Appendix B.

kumar-shridhar commented 5 years ago

Hi @xuanqing94, I used the checkpoint directly (cifar10_vgg_adv_vi.pth) and ran an FGSM attack (the simplest one), and it seems the model behaves the same as a frequentist one. Here are my results: [screenshot of results]
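For context on the attack being discussed: FGSM perturbs the input by one step of size epsilon in the sign direction of the input gradient of the loss. Below is a minimal NumPy sketch on a toy logistic model with a hand-derived gradient; it is illustrative only and is not the attack code from this repo.

```python
import numpy as np

def fgsm_perturb(x, grad_x, eps):
    """FGSM: move eps in the sign direction of the loss gradient w.r.t. the input."""
    x_adv = x + eps * np.sign(grad_x)
    return np.clip(x_adv, 0.0, 1.0)  # keep inputs in a valid pixel range

# Toy model: p = sigmoid(w . x), loss = -log p for true label 1,
# so dloss/dx = -(1 - p) * w (derived by hand for this toy case).
x = np.array([0.5, 0.2, 0.8])
w = np.array([1.0, -2.0, 0.5])
p = 1.0 / (1.0 + np.exp(-w @ x))
grad_x = -(1.0 - p) * w

x_adv = fgsm_perturb(x, grad_x, eps=0.1)
```

Against a Bayesian model, the key question raised in the next reply is how the prediction on `x_adv` is computed, since a single forward pass ignores the posterior.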

xuanqing94 commented 5 years ago

@kumar-shridhar It seems you didn't do aggregation; see ./acc_under_attack.py. Also, under high distortion (>0.1), no model does well (~10%). I suggest following the settings in the paper and code.

shamoons commented 5 years ago

Where do you put the CIFAR-10 data? And which version did you download, the Python one?