tingxueronghua / pytorch-classification-advprop

MIT License
Key part error #8

Closed: jiajinuiuc closed this issue 2 years ago

jiajinuiuc commented 2 years ago

Thanks for sharing your code! I find the key part confusing, and I think it (copied below) is not correct. Please correct me if I am wrong.

During training, if mixbn = False and the attacker is the PGD attacker, we first generate adversarial images with the aux BN (this part is correct). However, following the original paper, the clean images should go through the clean BN and the adversarial images should go through the adv BN. In your code, the clean and adversarial images are concatenated first and both go through the clean BN only.

    if training:
        self.eval()
        self.apply(to_adv_status)
        if isinstance(self.attacker, NoOpAttacker):
            images = x
            targets = labels
        else:
            aux_images, _ = self.attacker.attack(x, labels, self._forward_impl)
            images = torch.cat([x, aux_images], dim=0)
            targets = torch.cat([labels, labels], dim=0)
        self.train()

        if self.mixbn:
            # the DataParallel usually cat the outputs along the first dimension simply,
            # so if we don't change the dimensions, the outputs will be something like
            # [clean_batches_gpu1, adv_batches_gpu1, clean_batches_gpu2, adv_batches_gpu2...]
            # Then it will be hard to distinguish clean batches and adversarial batches.
            self.apply(to_mix_status)
            return self._forward_impl(images).view(2, input_len, -1).transpose(1, 0), targets.view(2, input_len).transpose(1, 0)
        else:
            self.apply(to_clean_status)
            return self._forward_impl(images), targets
    else:
        images = x
        targets = labels
        return self._forward_impl(images), targets
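
The comment in the mixbn branch about DataParallel can be hard to picture, so here is a small framework-free sketch of the layout problem it describes. Plain Python lists stand in for tensors, and the helper names (`split_across_gpus`, `replica_forward`, `gather`) are hypothetical, not part of the repository or of PyTorch. It assumes each replica returns clean outputs followed by adversarial outputs, as the quoted code does.

```python
def split_across_gpus(batch, num_gpus):
    """Mimic DataParallel scattering a batch along dimension 0."""
    per_gpu = len(batch) // num_gpus
    return [batch[i * per_gpu:(i + 1) * per_gpu] for i in range(num_gpus)]


def replica_forward(clean, interleave):
    """Stand-in for one replica; strings like "c0"/"a0" play the role of outputs."""
    adv = ["a" + name[1:] for name in clean]   # pretend attack on each sample
    stacked = clean + adv                      # like torch.cat([x, aux_images], dim=0)
    if not interleave:
        return stacked
    input_len = len(clean)
    two_rows = [stacked[:input_len], stacked[input_len:]]  # like .view(2, input_len, -1)
    return [list(pair) for pair in zip(*two_rows)]         # like .transpose(1, 0)


def gather(outputs):
    """Mimic DataParallel concatenating replica outputs along dimension 0."""
    return [item for replica in outputs for item in replica]


batch = ["c0", "c1", "c2", "c3"]
shards = split_across_gpus(batch, num_gpus=2)

# Without the reshape, clean and adversarial chunks alternate per GPU.
flat = gather([replica_forward(s, interleave=False) for s in shards])
# With the reshape, every gathered row is one [clean, adversarial] pair.
paired = gather([replica_forward(s, interleave=True) for s in shards])

clean_outputs = [pair[0] for pair in paired]
adv_outputs = [pair[1] for pair in paired]
print(flat)
print(paired)
```

With the pairing in place, the training loop can recover the clean and adversarial outputs by indexing the second dimension, regardless of how many GPUs the batch was split across.
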
tingxueronghua commented 2 years ago

OK, maybe I did not make it clear. If you want to run AdvProp, please follow the command in the README and make sure --mixbn is set. If mixbn is false, this is simply vanilla adversarial training.

Thanks for your advice! Feel free to reply if there are any other problems.
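
To make the difference between the two settings concrete, the following is a minimal sketch of a batch norm layer with a "clean", "adv", and "mix" status. It is not the repository's implementation: `ToyDualBatchNorm` is a hypothetical name, it works on plain lists of scalars instead of tensors, and it omits running statistics and learnable parameters. It assumes, as in the quoted code, that the batch is the clean half followed by the adversarial half.

```python
import math


class ToyDualBatchNorm:
    """Hypothetical stand-in for a dual batch norm layer (scalar features only)."""

    def __init__(self, eps=1e-5):
        self.eps = eps
        self.status = "clean"               # one of "clean", "adv", "mix"
        self.seen = {"clean": 0, "adv": 0}  # batches normalized by each branch

    def _normalize(self, values, branch):
        # Normalize with the statistics of the values routed to this branch.
        self.seen[branch] += 1
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        return [(v - mean) / math.sqrt(var + self.eps) for v in values]

    def __call__(self, values):
        if self.status == "mix":
            # First half is treated as clean, second half as adversarial.
            half = len(values) // 2
            return (self._normalize(values[:half], "clean")
                    + self._normalize(values[half:], "adv"))
        return self._normalize(values, self.status)


clean = [0.0, 1.0, 2.0, 3.0]
adv = [10.0, 11.0, 12.0, 13.0]   # pretend adversarial features, shifted on purpose
batch = clean + adv

layer = ToyDualBatchNorm()

layer.status = "clean"   # analogous to mixbn = False: one branch sees everything
joint = layer(batch)

layer.status = "mix"     # analogous to mixbn = True: each half uses its own branch
split = layer(batch)

print(joint)
print(split)
```

In the "clean" status the shifted adversarial values pull the shared mean upward, so every clean value ends up below zero after normalization. In the "mix" status each half is normalized with its own statistics, which is the separation the AdvProp setting relies on.
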

jiajinuiuc commented 2 years ago

Thank you so much, @tingxueronghua! Yes, you're right, I get your point. Can I ask one more question? Where do you define the dual batch norm layers in your code? I could not find the part of the code that defines this network architecture. Could you please point it out for me? Thanks in advance!

tingxueronghua commented 2 years ago

You are welcome. MixBatchNorm is defined here: https://github.com/tingxueronghua/pytorch-classification-advprop/blob/982dbe96f606336649090092eb6aaa617771be89/imagenet.py#L533

jiajinuiuc commented 2 years ago

I see! Thank you so much!