Azure / counterfit

a CLI that provides a generic automation layer for assessing the security of ML models
MIT License

Why are ['initial']['input'] and ['final']['input'] the same? #44

Open jiansuozhe opened 2 years ago

jiansuozhe commented 2 years ago

Hello @moohax, I read in your wiki that an evasion attack attempts to alter inputs so that the model gives an incorrect output. However, when I tried the hop skip jump evasion attack, I found that ['initial']['input'] and ['final']['input'] are the same. I thought they should be different from each other, right? Additionally, although the two inputs are the same, ['initial']['output'] and ['final']['output'] are different from each other. Could you please tell me the reason? I cannot find where ['initial']['output'] and ['final']['output'] come from. Thank you.

moohax commented 2 years ago

In the old version, you had to define y in the target. For example, if your sample was a picture of a cat, you'd need to know what the target model would label it. This wasn't ideal because of the black-box nature of our assessments, so we now first ask the model to label our sample and go from there.
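
To make the "ask the model first" flow concrete, here is a minimal sketch. It is not Counterfit's code and not HopSkipJump: `toy_model` and `run_evasion` are hypothetical names, the classifier is a made-up two-class function, and the attack is a plain greedy random search standing in for a real one. It only illustrates that the initial label comes from querying the model rather than from a user-supplied y, and how the initial and final inputs and outputs can be recorded and compared.

```python
import numpy as np


def toy_model(x):
    """Hypothetical black-box classifier: class probabilities for one sample."""
    score = 1.0 / (1.0 + np.exp(-x.sum()))
    return np.array([1.0 - score, score])


def run_evasion(sample, predict, step=0.5, max_iters=100, seed=0):
    """Greedy random search (a stand-in, not HopSkipJump)."""
    rng = np.random.default_rng(seed)

    # Ask the model to label the clean sample instead of requiring a known y.
    initial_output = predict(sample)
    initial_label = int(np.argmax(initial_output))

    # Work on a copy so the original sample is never modified in place.
    candidate = sample.copy()
    best_conf = initial_output[initial_label]

    for _ in range(max_iters):
        if int(np.argmax(predict(candidate))) != initial_label:
            break  # the model's label has flipped
        proposal = candidate + step * rng.normal(size=sample.shape)
        conf = predict(proposal)[initial_label]
        # Keep a proposal only if it lowers confidence in the initial label.
        if conf < best_conf:
            candidate, best_conf = proposal, conf

    final_output = predict(candidate)
    return {
        "initial": {"input": sample.copy(), "output": initial_output,
                    "label": initial_label},
        "final": {"input": candidate, "output": final_output,
                  "label": int(np.argmax(final_output))},
    }


result = run_evasion(np.array([0.2, 0.1, 0.3]), toy_model)

# A quick check for the situation described in this issue.
inputs_identical = np.array_equal(result["initial"]["input"],
                                  result["final"]["input"])
print("inputs identical:", inputs_identical)
print("labels:", result["initial"]["label"], "->", result["final"]["label"])
```

Comparing the two stored inputs with `np.array_equal`, as in the last lines, is one way to confirm whether a run actually changed the sample.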