fra31 / auto-attack

Code for "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"
https://arxiv.org/abs/2003.01690
MIT License

Create attack state #98

Closed · dedeswim closed 1 year ago

dedeswim commented 1 year ago

This PR introduces a state for the attack, so that the results can be restored and the evaluation resumed in case it gets interrupted (e.g., because the resources used for the evaluation are preempted).

The state is a dataclass that saves the following information:

- the attacks that still have to be run and those that have already been run;
- the robust_flags tensor, which marks the points that are still robust.

It's not necessary to save the robust accuracy, as it can easily be computed as torch.sum(robust_flags).item() / robust_flags.shape[0].

To save and restore the state, it's enough to pass the path where the state should be (or already is) saved to AutoAttack.run_standard_evaluation via its state_path argument.
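
For example, resuming an interrupted evaluation could look like the following (a minimal sketch: model, x_test, and y_test are assumed to be defined elsewhere, and the file name is made up; only the state_path keyword comes from this PR):

```python
from autoattack import AutoAttack

# Hypothetical setup: model, x_test, and y_test are assumed to exist.
adversary = AutoAttack(model, norm="Linf", eps=8 / 255)

# If the file at state_path already exists, the evaluation resumes from
# the restored state; otherwise the state is created there and updated
# as each attack finishes.
x_adv = adversary.run_standard_evaluation(
    x_test, y_test, bs=128, state_path="aa_state.json")
```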

fra31 commented 1 year ago

Hi,

thanks a lot for the contribution, it looks great! I'd just add a warning that, in case the evaluation restarts from a saved state, the output might not contain all the adversarial images that have been found (since these are not part of the state).

dedeswim commented 1 year ago

Hi! Thanks for the reply. Just to check: you mean to add a warning in the example script, i.e., here?

dedeswim commented 1 year ago

In the meantime, I added it; let me know if this is what you meant 😊

fra31 commented 1 year ago

I was thinking rather of somewhere like here (at the beginning, if state_path is not None) or here (at the end of the evaluation, if state_path is not None), maybe both. I would add it in autoattack.py itself, so that it's independent of the example. Does that make sense to you?

The message looks good to me; just a minor edit below. Also, I'd use self.logger.log to print the messages, so that they potentially get saved to the main log file as well (similarly for this).

> Since a state path is provided, the saved adversarial examples are **only** those obtained in the latest run of the attack.
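
For illustration, a minimal sketch of what that placement could look like in autoattack.py (only state_path, self.logger.log, and the message wording come from this thread; the signature and elided body are assumptions):

```python
# Hypothetical excerpt of AutoAttack.run_standard_evaluation, not the
# merged diff; the message is the one suggested above.
WARNING_MSG = ("Since a state path is provided, the saved adversarial "
               "examples are only those obtained in the latest run of "
               "the attack.")

def run_standard_evaluation(self, x_orig, y_orig, bs=250, state_path=None):
    if state_path is not None:
        self.logger.log(WARNING_MSG)  # warn when (re)starting with a state
    ...  # run the attacks, saving the state after each one finishes
    if state_path is not None:
        self.logger.log(WARNING_MSG)  # repeat the warning at the end
```
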
dedeswim commented 1 year ago

I see. Then I would refer not to the saved adversarial examples, but to those returned by the attack. What do you think?

dedeswim commented 1 year ago

Fixed. I have also changed a couple more print statements I found to self.logger.log. Would it make sense to mention in the warning that the robust accuracy can be computed with state.robust_flags.mean()? Or should I create a @property method inside EvaluationState that computes it?

dedeswim commented 1 year ago

I added the property; if a user calls it before all the attacks have been run, it warns them that the value is only intermediate. I also added a check that the attacks to run match between the AutoAttack object and the restored state, which should prevent errors from users loading the wrong state file.
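
A minimal sketch of both additions (EvaluationState, robust_flags, and the attacks-to-run check are named in this thread; all other names and details are assumptions, and the merged code may differ):

```python
import warnings
from dataclasses import dataclass, field
from typing import Set, Tuple

import torch


@dataclass
class EvaluationState:
    attacks_to_run: Tuple[str, ...]
    robust_flags: torch.Tensor  # one boolean flag per test point
    run_attacks: Set[str] = field(default_factory=set)

    @property
    def robust_accuracy(self) -> float:
        # Warn if the property is read before every attack has run.
        if set(self.attacks_to_run) != self.run_attacks:
            warnings.warn("Not all attacks have been run yet, so this "
                          "is only an intermediate robust accuracy.")
        # robust_flags is boolean, so cast before averaging.
        return self.robust_flags.float().mean().item()


def check_state_matches(adversary_attacks, state: EvaluationState) -> None:
    # Guards against loading a state file produced with different
    # attacks_to_run than the current AutoAttack object.
    if tuple(adversary_attacks) != tuple(state.attacks_to_run):
        raise ValueError("The attacks in the loaded state do not match "
                         "those of the AutoAttack object.")
```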

fra31 commented 1 year ago

Great, thanks again!