Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

change docstring for `SimBA.generate` parameter `y` #1407

Open Embeddave opened 2 years ago

Embeddave commented 2 years ago

**Is your feature request related to a problem? Please describe.**
Currently the docstring for `SimBA.generate` states that `y` should be class labels (as is true for most ART attacks).

https://github.com/Trusted-AI/adversarial-robustness-toolbox/blob/85ca10ff6b696dbaa66dcf179d0493b12ff594bb/art/attacks/evasion/simba.py#L104

But in this case the `SimBA.generate` method actually expects a `(1, c)` matrix of class probabilities.
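For example, the difference looks roughly like this (a sketch with illustrative values; `nb_classes` and the probabilities are placeholders, not taken from ART):

```python
import numpy as np

nb_classes = 10

# What the current docstring suggests: an integer class label for the input sample
y_label = np.array([3])  # shape (1,)

# What the code actually consumes: a (1, nb_classes) row of class probabilities,
# e.g. the model's softmax output for the single input sample
y_prob = np.full((1, nb_classes), 0.05, dtype=np.float32)
y_prob[0, 3] = 0.55  # rows sum to 1.0
```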

**Describe the solution you'd like**
I understand there are trade-offs when trying to provide a consistent API, but it's definitely confusing that the docstring disagrees with what the code expects.

Maybe keep the argument name but change the docstring?

**Describe alternatives you've considered**

Not sure if either of those is better, just putting alternatives out there.

beat-buesser commented 2 years ago

Hi @Embeddave Thank you very much for your proposal, let me take a closer look.

beat-buesser commented 2 years ago

@Embeddave I think you have found a bug. The algorithm should be ok, but it works only on a single sample, which is also reflected by requiring `batch_size=1`. If more than one sample is provided in `x` and `y` to `generate`, it only considers the first sample.

I think we need to add a loop over all samples in x to generate adversarial examples for all input samples.
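Roughly along these lines (just a sketch of the idea, not the actual PR code; `attack` is assumed to be a configured `SimBA` instance):

```python
import numpy as np

def generate_for_all_samples(attack, x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Run the single-sample attack once per input and stack the results.

    Sketch only: `x` has shape (n, ...) and `y` has shape (n, nb_classes)
    with class probabilities, one row per sample.
    """
    x_adv = [attack.generate(x=x[i : i + 1], y=y[i : i + 1]) for i in range(x.shape[0])]
    return np.concatenate(x_adv, axis=0)
```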

beat-buesser commented 2 years ago

@Embeddave May I ask you for a review and test of PR #1422, which aims to fix this issue?

Embeddave commented 2 years ago

Thank you for working on this. Replied on the PR -- I could review and test early next week if that would be ok

Embeddave commented 2 years ago

Hi @beat-buesser

I see now that this point might have gotten lost in the way I described the issue here. I'm sorry I was unclear.

You're correct that it's a bug (and confusing!) that the original code was written to only accept a batch size of 1, and it's good you are fixing it.

But what is more confusing to me is that the attack specifically expects probabilities from the model, which is not very clear from the argument names, the docstrings, or the code.

At first I naively used an estimator that returns logits, and found that the attack takes much longer to run. In some cases I think it may fail silently, but I have not verified that.

I found that I needed to add a softmax layer onto my PyTorch model to get the attack to succeed in any reasonable length of time. That should have been obvious from how the attack is derived in the paper, but it's not clear from the ART code.
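For reference, this is roughly what I ended up doing (a minimal sketch with a placeholder network and dummy input, not my actual model):

```python
import numpy as np
import torch.nn as nn
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import SimBA

# Placeholder network that returns logits, standing in for my real model
logits_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# Append a softmax so the wrapped classifier's predictions are probabilities,
# which is what SimBA expects
model_with_probs = nn.Sequential(logits_model, nn.Softmax(dim=1))

classifier = PyTorchClassifier(
    model=model_with_probs,
    loss=nn.CrossEntropyLoss(),
    input_shape=(1, 28, 28),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

attack = SimBA(classifier)  # the attack works on one sample at a time
x = np.random.rand(1, 1, 28, 28).astype(np.float32)  # single dummy sample
x_adv = attack.generate(x=x)  # y omitted here; see the discussion above about what y should contain
```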

It's clearer (to me at least) from the original implementation, where the variable is explicitly named `y_prob`.

That's what I meant when I suggested possibly changing the parameter name or its definition in the docstring. Another option may be to alias it within the `generate` function so it's explicit:

`y_prob = y`
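Something like this is what I had in mind (purely illustrative, not the actual ART code; I'm assuming the usual `generate(self, x, y=None, **kwargs)` signature):

```python
from typing import Optional

import numpy as np

def generate(self, x: np.ndarray, y: Optional[np.ndarray] = None, **kwargs) -> np.ndarray:
    """
    Generate adversarial samples.

    :param x: An array with the original input to be attacked.
    :param y: A (1, nb_classes) array of the model's output probabilities for `x`,
              not integer class labels.
    """
    y_prob = y  # alias so the expected semantics are explicit in the body
    ...
```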

Even the tests just use the standard white-box model that returns logits instead of probabilities, as far as I can tell. This threw me off, because I looked to the tests to determine whether I was running the attack correctly. If the intent of the tests is to do more than a smoke test and actually verify the attack succeeds, maybe a black-box estimator should be used as a fixture? And/or a standard estimator that has been modified to return probabilities?

Whatever could be done to make it clear that a model returning probabilities is needed would be good for future users of the attack.

beat-buesser commented 2 years ago

Hi @Embeddave Thank you, that's very valuable feedback! I have added an additional commit to #1422 to integrate your proposal. What do you think?