Explaining and Harnessing Adversarial Examples (FGSM)

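Neural network models behave in a largely linear way.
Introduces the fast gradient sign method (FGSM).
Introduces adversarial training.
Adversarial examples transfer across different architectures.
Non-linear networks such as radial basis function (RBF) networks can resist adversarial attacks to some degree.
Linearity is the cause of cross-model generalization.
Adversarial examples can be explained as a property of high-dimensional dot products. They arise because models are too linear.
The generalization of adversarial examples across models can be explained by different models learning similar functions when trained on the same task. That is, different models end up with similar weight vectors, and adversarial examples are generated according to a model's weight vector.
Adversarial examples do not sit at specific points in input space. They occupy broad regions determined by the direction of the perturbation.
Because the direction of the perturbation is what matters most, an adversarial perturbation generalizes across different clean examples.
Adversarial training has a regularization effect.
Models that are easy to optimize are easy to perturb.
Linear models lack the capacity to resist adversarial attacks.
Ensembles help a little in defending against adversarial attacks, but not enough.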
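
The FGSM step and the adversarial training objective from the notes above can be sketched on a tiny logistic regression, the linear case where the gradient sign has a closed form. This is only a minimal sketch, not the paper's code: the weights `w`, bias `b`, input `x`, and `eps` are made-up values, and `fgsm` and `adversarial_objective` are hypothetical helper names. Labels are assumed to be y in {-1, +1} with loss J = log(1 + exp(-y(w·x + b))), so the input gradient is a negative multiple of y·w. The mixing weight `alpha = 0.5` follows the value used in the paper.

```python
import math

def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(w, b, x, y):
    # Logistic loss J = log(1 + exp(-y * (w.x + b))), with y in {-1, +1}.
    return math.log1p(math.exp(-y * logit(w, b, x)))

def input_gradient(w, b, x, y):
    # dJ/dx = -y * sigmoid(-y * z) * w, where z = w.x + b.
    z = logit(w, b, x)
    s = 1.0 / (1.0 + math.exp(y * z))
    return [-y * s * wi for wi in w]

def fgsm(w, b, x, y, eps):
    # x_adv = x + eps * sign(grad_x J): every coordinate moves by at most eps.
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def adversarial_objective(w, b, x, y, eps, alpha=0.5):
    # alpha * J(x) + (1 - alpha) * J(x_adv): the adversarial training loss.
    x_adv = fgsm(w, b, x, y, eps)
    return alpha * loss(w, b, x, y) + (1 - alpha) * loss(w, b, x_adv, y)

# Made-up illustration values (assumptions, not taken from the paper).
w = [0.5, -1.0, 0.25, 2.0]
b = 0.1
x = [0.2, 0.1, 0.4, 0.3]
y = 1
eps = 0.25

x_adv = fgsm(w, b, x, y, eps)
print("clean logit:", logit(w, b, x))
print("adversarial logit:", logit(w, b, x_adv))
print("adversarial training loss:", adversarial_objective(w, b, x, y, eps))
```

With these values the logit changes sign, so the prediction flips even though no coordinate of `x` moved by more than `eps`.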

However, I am still not sure why the L-infinity norm (max norm) is used -> (Because a linear model is assumed to be high-dimensional, a max norm constraint is applied rather than an L1 or L2 norm constraint).
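
One way to read the parenthetical answer above: under a max norm budget every coordinate may move by ε, so for a linear response w·x the perturbation η = ε·sign(w) shifts the activation by ε·||w||₁. That shift grows with the dimension n while ||η||∞ stays at ε. The following is a minimal sketch with random made-up weights; `activation_shift` is a hypothetical helper, and the uniform weight distribution is an assumption chosen only for illustration.

```python
import random

def activation_shift(w, eps):
    # eta = eps * sign(w) is the max-norm-bounded perturbation that
    # maximizes the change in the dot product w . (x + eta).
    eta = [eps * ((wi > 0) - (wi < 0)) for wi in w]
    shift = sum(wi * ei for wi, ei in zip(w, eta))  # equals eps * ||w||_1
    max_norm = max(abs(e) for e in eta)             # stays at eps
    return shift, max_norm

random.seed(0)
eps = 0.01
for n in (10, 1000, 100000):
    # Assumed weights: uniform in [-1, 1], so the mean magnitude is about 0.5.
    w = [random.uniform(-1.0, 1.0) for _ in range(n)]
    shift, max_norm = activation_shift(w, eps)
    print(f"n={n:6d}  shift={shift:.3f}  max_norm={max_norm}")
```

The printed shift scales roughly linearly with n, which is the high-dimensional dot product effect the notes refer to.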

"For example, digital images often use only 8 bits per pixel so they discard all information below 1/255 of the dynamic range. Because the precision of the features is limited, it is not rational for the classifier to respond differently to an input x than to an adversarial input x̃ = x + η if every element of the perturbation η is smaller than the precision of the features. Formally, for problems with well-separated classes, we expect the classifier to assign the same class to x and x̃ so long as ||η||∞ < ε, where ε is small enough to be discarded by the sensor or data storage apparatus associated with our problem."

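The precision argument in the quote can be illustrated with a small sketch. It assumes pixel values in [0, 1] stored by round-to-nearest 8-bit quantization, which is an assumption of this example and not something the paper specifies. Under that assumption a per-pixel perturbation below half a quantization step is discarded on storage. `quantize_8bit` is a hypothetical helper name.

```python
def quantize_8bit(x):
    # Clamp to [0, 1], then map to the nearest of the 256 integer levels.
    clamped = min(max(x, 0.0), 1.0)
    return round(clamped * 255)

levels = [0, 17, 128, 254, 255]  # made-up sample pixel levels
eta = 0.4 / 255                  # below half a quantization step

for k in levels:
    x = k / 255
    stored_plus = quantize_8bit(x + eta)
    stored_minus = quantize_8bit(x - eta)
    print(k, stored_plus, stored_minus)
```

Each line prints the same level three times: the perturbed pixel is stored exactly as the clean one, so a classifier reading the stored image cannot distinguish them.
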
https://arxiv.org/pdf/1412.6572.pdf
