hendrycks / ss-ood

Self-Supervised Learning for OOD Detection (NeurIPS 2019)
MIT License

Inconsistent pixel range values #8

Closed: pratik18v closed this issue 4 years ago

pratik18v commented 4 years ago

In the adversarial training code, the input to the model is in the range [-1, 1]. However, in the attack code the pixel values are clipped to the range [0, 1]. This seems like a bug to me, unless I am missing something.

hendrycks commented 4 years ago

We optimize the adversarial image in [0, 1] and repeatedly convert it to [-1, 1] on line 61 before each forward pass: https://github.com/hendrycks/ss-ood/blob/master/adversarial/attacks.py#L61
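
In outline, the pattern is the following (a simplified sketch of the idea, not the repository's exact attack code; the parameter names are illustrative):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, bx, by, eps=8/255, step_size=2/255, num_steps=10):
    """L-infinity PGD that optimizes the adversarial image in [0, 1] and
    maps it to [-1, 1] only for each forward pass through the model."""
    adv_bx = bx.detach().clone()
    adv_bx += torch.empty_like(adv_bx).uniform_(-eps, eps)  # random start
    adv_bx = adv_bx.clamp(0, 1)

    for _ in range(num_steps):
        adv_bx.requires_grad_()
        with torch.enable_grad():
            logits = model(adv_bx * 2 - 1)  # convert [0, 1] -> [-1, 1] for the model
            loss = F.cross_entropy(logits, by)
        grad = torch.autograd.grad(loss, adv_bx)[0]
        adv_bx = adv_bx.detach() + step_size * grad.sign()
        # project back into the eps-ball around bx and into the valid pixel range
        adv_bx = torch.min(torch.max(adv_bx, bx - eps), bx + eps).clamp(0, 1)
    return adv_bx
```

So the [0, 1] clipping and the [-1, 1] model inputs are consistent: clipping happens in pixel space, and normalization happens at the model boundary.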

pratik18v commented 4 years ago

Thank you for the response. I am using your code in the 'adversarial' directory to replicate the numbers in Table 1 of the paper. To get the standard adversarial training numbers, I modified https://github.com/hendrycks/ss-ood/blob/master/adversarial/train.py as follows (see the sketch after this list):

  1. Set attack_rotations = False in the attack definition
  2. Replace https://github.com/hendrycks/ss-ood/blob/master/adversarial/train.py#L135-L142 with bx, by = bx.cuda(), by.cuda()
  3. Set by_prime = None
  4. Comment out https://github.com/hendrycks/ss-ood/blob/master/adversarial/train.py#L153
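
Putting the four changes together, my inner training loop looks roughly like this (an illustration with assumed names for the attack and loader, not the exact train.py code):

```python
import torch.nn.functional as F

# model, optimizer, train_loader, and attack are assumed set up as in train.py,
# with the attack constructed using attack_rotations=False (change 1).
for bx, by in train_loader:
    bx, by = bx.cuda(), by.cuda()       # change 2: no rotated copies are built
    by_prime = None                     # change 3: no rotation targets
    adv_bx = attack(model, bx, by)      # perturbed batch, pixels in [0, 1]
    logits = model(adv_bx * 2 - 1)      # model expects inputs in [-1, 1]
    loss = F.cross_entropy(logits, by)  # change 4: no rotation loss term
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```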

However, I am getting similar robustness numbers in both cases (with and without the rotation loss). Can you please tell me how to replicate the numbers in Table 1 of your paper?

hendrycks commented 4 years ago

Are you still using rotations in any way? If so, then it's not standard adversarial training.

pratik18v commented 4 years ago

I don't think I am using rotations at all in my standard adversarial training script. Based on the changes above, I am never generating rotated images or using the rotation loss.

hendrycks commented 4 years ago

What's your adversarial accuracy?

pratik18v commented 4 years ago

I am getting around 50% for both cases under a 10-step PGD attack (eps = 8/255, step size = 2/255).

hendrycks commented 4 years ago

That simply cannot be. Try the code from https://drive.google.com/file/d/16B7Bt-lqQlD24XTmMip40D077VX-Lc5L/view?usp=sharing for normal adversarial training.

pratik18v commented 4 years ago

Thank you for taking the time to answer my queries and for sharing your code; I really appreciate it. I tried the code you provided for the baseline and re-ran the code for the rotation model, and I got the following results:

[Baseline] {'batch_size': 128, 'dataset': 'cifar10', 'decay': 0.0005, 'device': 1, 'droprate': 0.0, 'epochs': 100, 'layers': 40, 'learning_rate': 0.1, 'load': 'snapshots/baseline/', 'model': 'wrn', 'momentum': 0.9, 'prefetch': 2, 'save': './snapshots/baseline', 'test': True, 'test_bs': 200, 'widen_factor': 2, 'test_loss': 1.5106859993934632, 'test_accuracy': 0.4858}

[Rotation model] {'batch_size': 128, 'dataset': 'cifar10', 'decay': 0.0005, 'device': 0, 'droprate': 0.0, 'epochs': 100, 'layers': 40, 'learning_rate': 0.1, 'load': 'snapshots/rot_five', 'model': 'wrn', 'momentum': 0.9, 'prefetch': 4, 'save': './snapshots/rot_five', 'test': True, 'test_bs': 200, 'widen_factor': 2, 'test_loss': 1.4669106388092041, 'test_accuracy': 0.4916}

I trained both models against a 10-step PGD attack; the results above are for a 20-step attack on those models.

hendrycks commented 4 years ago

I would not be surprised if I uploaded the wrong rotation code. But I am surprised that the baseline is 48.6%. If that's a 40-2 WRN, then I'd expect around 44.8%. Even the much larger 28-10 network from https://github.com/MadryLab/cifar10_challenge only reaches about 47.0%.

pratik18v commented 4 years ago

Yeah, I have the same concern; however, both my code and your code for the baseline are returning the same numbers. Maybe there is a bug elsewhere (like the attack code)? Do you get 44.8% when you run the baseline code (the one you shared) on your machine?

The other potential issue, as you said, might be with the rotation code. For example, you have not uploaded the wrn_prime.py file to the adversarial folder. If the only difference between wrn.py and wrn_prime.py is the return values of the forward function (https://github.com/hendrycks/ss-ood/blob/master/adversarial/models/wrn.py#L96), then that's nothing to worry about.

Please let me know if you figure out what the issue is, as I am hoping to use your proposed model in my current research. Thank you!
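
For concreteness, here is the kind of difference I mean (a minimal toy sketch; since wrn_prime.py is not in the repository, the two-head structure and names here are my assumptions):

```python
import torch.nn as nn

class TwoHeadNet(nn.Module):
    """Toy stand-in for a WRN with a classification head plus a rotation
    head; the plain wrn.py forward would return only the first output."""
    def __init__(self, feat_dim=128, num_classes=10, num_rotations=4):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.rot_head = nn.Linear(feat_dim, num_rotations)

    def forward(self, x):
        z = self.trunk(x)
        return self.cls_head(z), self.rot_head(z)  # the only presumed change
```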

hendrycks commented 4 years ago

I'm not sure I used that rotation code, but I often use the vanilla code in the file I sent. I'd be surprised if my baseline implementation is somehow a few percent better than the rest of the community's. I don't have reason to suspect the attack code. The fact that the baseline is so strong is a mystery to me.

Chaimmoon commented 4 years ago

I am also facing the same issue and can't solve it.

hendrycks commented 4 years ago

Hi, I changed https://github.com/hendrycks/ss-ood/blob/master/adversarial/attacks.py#L64. We set lambda = 0.5 during training, and since we're taking a sum in the attack code, we divide by 8. Let me know if that helps.
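
In code, the weighting looks roughly like this (an illustrative sketch with assumed variable names, not the exact line in attacks.py):

```python
import torch.nn.functional as F

# class_logits, by, and rot_loss_sum are assumed computed earlier in the step;
# lambda = 0.5 weights the rotation term, and the extra division by 8
# compensates for the attack code summing rather than averaging.
lam = 0.5
loss = F.cross_entropy(class_logits, by) + (lam / 8.0) * rot_loss_sum
```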

Chaimmoon commented 4 years ago

Hi,

Your advice doesn't address the problem with the performance of normal adversarial training under the 20-step PGD attack. In my experiment, the final accuracy of normal adversarial training under 20-step PGD is around 48%, the same as @pratik18v, so maybe there is a problem with the code or the results.

Best, Mu