THUYimingLi / BackdoorBox

The open-sourced Python toolbox for backdoor attacks and defenses.
GNU General Public License v2.0

Empirical study on the effect of `poisoned_transform_train_index`? #66

Closed Oklahomawhore closed 9 months ago

Oklahomawhore commented 9 months ago

Hi big brother, I'm a graduate student from SJTU doing research on backdoor learning. Thank you for the project; it has been of great help to my study. I've previously experimented with injecting the backdoor trigger at different points in the torchvision transforms, and I'm curious: do you have empirical results on the effect of injecting backdoor triggers at different stages of image augmentation?

Thank you very much for your time.

THUYimingLi commented 9 months ago

Hi, thanks for reaching out and using our toolbox! Could you provide more details about 'at different stage of image augmentation'?

Oklahomawhore commented 9 months ago

Sorry for not making it clear! Here is a simplified explanation of my problem; I hope it helps!

Say the train transform includes RandomHorizontalFlip and Normalize, the dataset is CIFAR-10, and the attack method is BadNets. From the image file to the network input, there are several points at which we can patch the backdoor trigger onto the image; we can add the trigger at point A, B, or C, as explained below:

Image Transformation Pipeline with Trigger Insertion Points

  1. Original Image

    • The raw image from the CIFAR-10 dataset.
  2. ➤ [Point A] Trigger Insertion (Before RandomHorizontalFlip)

    • At this point, a trigger can be added to the original image before applying any augmentation.
  3. ➤ RandomHorizontalFlip

    • Data augmentation step that randomly flips the image horizontally.
  4. ➤ [Point B] Trigger Insertion (After RandomHorizontalFlip, Before Normalization)

    • Here, the trigger can be added after the RandomHorizontalFlip but before normalization.
  5. ➤ Normalization

    • Standardizing the pixel values of the image.
  6. ➤ [Point C] Trigger Insertion (After Normalization)

    • Finally, a trigger can be added after the normalization process.

There are three choices of trigger insertion point; my question is whether there is any difference between them.
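
For concreteness, here is a rough torchvision sketch of the three options (the `add_trigger` helper and the normalization statistics are only illustrative stand-ins for the BadNets patch, not actual toolbox code):

```python
from torchvision import transforms
from torchvision.datasets import CIFAR10

# Hypothetical stand-in for the BadNets patch: a small white square in the corner.
# ToTensor is moved to the front so the same helper works on tensors at A, B, and C.
def add_trigger(img):
    img = img.clone()
    img[:, -3:, -3:] = 1.0   # 1.0 means "white" only before normalization
    return img

CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)   # commonly used CIFAR-10 statistics
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

# Point A: trigger on the (tensorized) original image, before any augmentation.
transform_a = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(add_trigger),
    transforms.RandomHorizontalFlip(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

# Point B: trigger after RandomHorizontalFlip, before normalization.
transform_b = transforms.Compose([
    transforms.ToTensor(),
    transforms.RandomHorizontalFlip(),
    transforms.Lambda(add_trigger),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

# Point C: trigger after normalization (pixel values are already standardized).
transform_c = transforms.Compose([
    transforms.ToTensor(),
    transforms.RandomHorizontalFlip(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
    transforms.Lambda(add_trigger),
])

trainset = CIFAR10(root='./data', train=True, download=True, transform=transform_b)
```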

THUYimingLi commented 9 months ago

I see. This is an interesting question. Although I have not evaluated it comprehensively (I believe no paper has done so), I have some analyses, as follows.

  1. For poison-only backdoor attacks, trigger patterns can only be added at point (a), since the adversaries cannot control the training process.
  2. For training-controlled or model-modified attacks, people usually add triggers at point (b), since the transformations change the distribution of the 'real' trigger learned by the DNNs (notice that only normalization will be used in the inference process!).
  3. In general, I think point (a) is the hardest and point (b) the easiest setting for attacking DNNs. However, in most cases, I think the choice has only minor effects on the final attack performance, since the backdoor is an easily learned model shortcut. But in some hard-to-learn cases (e.g., a one-pixel trigger), it may make a difference.
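
To make points 1 and 2 concrete, here is a minimal sketch of what I mean (this is not BackdoorBox's actual implementation; the `stamp_trigger` helper, poisoning rate, and target label are only illustrative). A poison-only attacker can only stamp the stored raw images (point (a)), while at inference time the trigger is added and only normalization follows, which corresponds to the point-(b) view of the trigger.

```python
import random
import numpy as np
from PIL import Image
from torchvision import transforms
from torchvision.datasets import CIFAR10

def stamp_trigger(img: Image.Image) -> Image.Image:
    """Illustrative BadNets-style trigger: a white 3x3 patch in the bottom-right corner."""
    arr = np.array(img)
    arr[-3:, -3:, :] = 255
    return Image.fromarray(arr)

class PoisonedCIFAR10(CIFAR10):
    """Poison-only setting: the attacker can only edit the stored images/labels,
    so the trigger is stamped on the raw image (point (a)) before any augmentation."""
    def __init__(self, *args, poison_rate=0.05, target_label=0, **kwargs):
        super().__init__(*args, **kwargs)
        num_poisoned = int(poison_rate * len(self.data))
        self.poisoned_idx = set(random.sample(range(len(self.data)), num_poisoned))
        self.target_label = target_label

    def __getitem__(self, index):
        img, label = Image.fromarray(self.data[index]), int(self.targets[index])
        if index in self.poisoned_idx:
            img = stamp_trigger(img)      # point (a): before flip and normalization
            label = self.target_label
        if self.transform is not None:
            img = self.transform(img)
        return img, label

CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

# Training: the stamped trigger then passes through flip + ToTensor + Normalize.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

# Inference: only normalization follows the trigger, so the trigger distribution
# the DNN must respond to at test time is the point-(b) version.
test_transform = transforms.Compose([
    transforms.Lambda(stamp_trigger),
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

poisoned_train = PoisonedCIFAR10(root='./data', train=True, download=True,
                                 transform=train_transform)
```
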
Oklahomawhore commented 9 months ago

Thank you very much for your great insight, it's of tremendous help!