Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

L¹ `FGM` is wrong + extend to all p >= 1 #2381

Open ego-thales opened 10 months ago

ego-thales commented 10 months ago

Hello,

I'm not sure, but I think the FGM extension to the $L^1$ norm is not correct.

From what I can read here, it seems to me that the current version implements (essentially) $$\text{noise direction}=\frac{\nabla}{\Vert\nabla\Vert_1},$$ whereas $$\text{noise direction}=(0, \dots, 0, \text{sign}(\nabla_i), 0, \dots, 0),\quad i=\text{argmax}_j\vert\nabla_j\vert,$$ gives a higher inner product $\langle\nabla,\text{noise direction}\rangle$ for the same $L^1$ budget.

Indeed, in both cases $\Vert\text{noise direction}\Vert_1=1$, while the first and second options respectively give inner products $\Vert\nabla\Vert_2^2/\Vert\nabla\Vert_1$ and $\Vert\nabla\Vert_{\infty}$. The latter is of course larger, by Hölder's inequality.
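
For concreteness, here is a small NumPy check (illustration only, not code from ART) comparing the two candidate $L^1$-unit directions on a random gradient; the array name `grad` and its size are arbitrary:

```python
# Quick numerical check (plain NumPy, not ART code): for a random gradient,
# compare the inner product <grad, d> achieved by the two candidate
# L1-unit noise directions discussed above.
import numpy as np

rng = np.random.default_rng(0)
grad = rng.normal(size=1000)

# Current behaviour (essentially): gradient normalized by its L1 norm.
d_current = grad / np.abs(grad).sum()

# Proposed direction: all mass on the largest-magnitude coordinate.
d_onehot = np.zeros_like(grad)
i = np.argmax(np.abs(grad))
d_onehot[i] = np.sign(grad[i])

# Both directions have unit L1 norm ...
assert np.isclose(np.abs(d_current).sum(), 1.0)
assert np.isclose(np.abs(d_onehot).sum(), 1.0)

# ... but the one-hot direction achieves ||grad||_inf, which dominates
# ||grad||_2^2 / ||grad||_1 by Hölder's inequality.
print(grad @ d_current)  # == ||grad||_2^2 / ||grad||_1
print(grad @ d_onehot)   # == ||grad||_inf  (larger)
```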


Edit: See here for a generalization to all $p\in[1, +\infty]$.

beat-buesser commented 10 months ago

Hi @ego-thales Thank you for this comment. Without deciding on the correctness yet, how did you notice this issue? Have you already checked which version the literature on FGM is using?

ego-thales commented 10 months ago

Thanks for your answer,

I stumbled upon this while reading the FGSM paper (the reference for the implementation), when I thought about generalizing to $L^p$ norms. I then saw that this repo implements the $L^1$ and $L^2$ extensions specifically, so I went and checked the code (since there is no cited source regarding the maths used) and noticed this (apparently) suboptimal implementation.

ego-thales commented 10 months ago

Actually, now that I think about it, I don't see any reason why this attack should not be generalized to arbitrary $L^p$ noise.

Let $p\in[1, +\infty]$ and $q$ such that $\frac{1}{p}+\frac{1}{q}=1$ (some abuse of notation will occur when $p=1$ or $p=+\infty$). With

$$\text{noise direction}:=\left(\frac{\vert\nabla\vert}{\Vert\nabla\Vert_q}\right)^{q/p}\text{sign}(\nabla),$$ one gets:

- $\Vert\text{noise direction}\Vert_p=1$, so the attack respects a unit $L^p$ budget;
- $\langle\nabla,\text{noise direction}\rangle=\Vert\nabla\Vert_q$, which is the largest value achievable under that budget, by Hölder's inequality.

As such, it would be a nice addition to entirely generalize FGM to all $p\geq 1$.
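
Below is a minimal NumPy sketch of this generalized direction (a standalone illustration, not a patch against ART's `FastGradientMethod`). It assumes $1 < p < \infty$; the $p=1$ and $p=\infty$ endpoints reduce to the one-hot and plain sign directions already discussed:

```python
# Standalone sketch of the generalized L^p FGM direction, with q the
# Hölder conjugate of p. Assumes 1 < p < inf.
import numpy as np

def lp_fgm_direction(grad: np.ndarray, p: float) -> np.ndarray:
    q = p / (p - 1)                       # conjugate exponent, 1/p + 1/q = 1
    mag = np.abs(grad)
    q_norm = (mag ** q).sum() ** (1 / q)  # ||grad||_q
    return (mag / q_norm) ** (q / p) * np.sign(grad)

rng = np.random.default_rng(0)
grad = rng.normal(size=1000)
for p in (1.5, 2.0, 3.0):
    q = p / (p - 1)
    d = lp_fgm_direction(grad, p)
    lp_norm = (np.abs(d) ** p).sum() ** (1 / p)
    q_norm = (np.abs(grad) ** q).sum() ** (1 / q)
    # ||d||_p == 1 and <grad, d> == ||grad||_q (the Hölder upper bound).
    print(p, lp_norm, grad @ d, q_norm)
```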

beat-buesser commented 10 months ago

Hi @ego-thales Thank you very much for the explanation and pull request! Let me take a closer look at the required changes. Related to this issue in FGSM, what do you think about the perturbation per iteration and overall perturbation calculation for p=1 in the Projected Gradient Descent attacks in art.attacks.evasion.projected_gradient_descent.*?

eliegoudout commented 10 months ago

I'm not entirely sure, but after a quick glance it looks to me like PGD was implemented as a subclass of FGSM and inherits its loss from it.