Open ideasplus opened 12 months ago
@Wilmido @ycz11 any updates?
Sorry for the late reply. Your point is very valid, but PointCRT does not require knowledge of the backdoor attack type in advance.
The purpose of training the classifier is to let it learn what a clean sample looks like; all other samples are treated as backdoor samples. So we don't need the to-be-detected backdoor samples as positive samples. As shown in Figure 6, we have actually conducted transferability experiments, and concluded that classifiers trained with transformation-based backdoor triggers as the known backdoor attack exhibit excellent transferability.
I hope the above response is helpful to you.
@Wilmido Thanks for your response! Since I just had a quick look at the paper, I may have missed some details.
I have one more question: in your main experiments (as opposed to the ablation study), do you train a separate classifier on each to-be-detected attack, or do you train one classifier on a default attack type and then use it to detect all attack types?
We rigorously conducted the main experiments on TeCo and SCALE-UP following their original repositories. These experiments were conducted with the known backdoor attack. To be honest, we did not notice this problem. However, this setting ensures the fairest comparison with these methods. As you can see, directly applying their predefined thresholds from the 2D image domain would be much more inappropriate. But I personally also agree that this seems tricky.
@Wilmido Thanks for your response and sorry for the late reply.
If I understand correctly, both TeCo and SCALE-UP claim that they do not have any prior information about the backdoor attack. They should only need clean samples to determine the detector's threshold. I haven't looked at their code yet. Do you mean they also need known poisoned samples to determine the threshold?
Please correct me if I'm wrong. Thank you.
Indeed. The ROC evaluation code for both TeCo and SCALE-UP compares the predictions with the ground-truth labels (whether each sample is a backdoor sample), using sklearn.metrics.roc_curve to obtain the thresholds:
from sklearn import metrics
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=1)
print(metrics.roc_auc_score(y, pred))
So you won't find a definition of "thresholds" in their repositories, because the thresholds are not manually selected! Our approach, in contrast, involves training a classifier to detect backdoor samples, eliminating the need for hyperparameter selection. However, it must be acknowledged that the requirement for a known backdoor attack to train the classifier is unavoidable, as you pointed out earlier.
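To make the distinction in this exchange concrete, here is a minimal sketch on synthetic scores. It is not taken from the TeCo, SCALE-UP, or PointCRT repositories. The score distributions, the helper `threshold_from_clean`, and the 5% target false-positive rate are all assumptions chosen for illustration. The first part shows that `roc_curve` needs ground-truth labels for both clean and backdoor samples and returns thresholds taken from the observed scores. The second part shows what a clean-only threshold could look like.

```python
import numpy as np
from sklearn import metrics

rng = np.random.default_rng(0)

# Synthetic detector scores (assumption: higher means "more likely backdoor").
clean_scores = rng.normal(0.0, 1.0, size=1000)
backdoor_scores = rng.normal(2.0, 1.0, size=1000)

# ROC-style evaluation: ground-truth labels for BOTH classes are required.
y = np.concatenate([np.zeros(1000, dtype=int), np.ones(1000, dtype=int)])
pred = np.concatenate([clean_scores, backdoor_scores])
fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=1)
auc = metrics.roc_auc_score(y, pred)

# Clean-only calibration (hypothetical helper, not from any of the papers):
# pick the score quantile that flags roughly target_fpr of the clean samples.
def threshold_from_clean(scores, target_fpr=0.05):
    return float(np.quantile(scores, 1.0 - target_fpr))

clean_only_thr = threshold_from_clean(clean_scores)
clean_fpr = float(np.mean(clean_scores > clean_only_thr))
backdoor_tpr = float(np.mean(backdoor_scores > clean_only_thr))

print(f"AUC (needs both classes): {auc:.3f}")
print(f"clean-only threshold: {clean_only_thr:.3f}")
print(f"FPR on clean: {clean_fpr:.3f}, TPR on backdoor: {backdoor_tpr:.3f}")
```

In this sketch the ROC thresholds after the first entry are simply values drawn from `pred`, so nothing is hand-tuned, but computing them still depends on having labeled backdoor samples. The quantile-based threshold uses only `clean_scores`; the backdoor scores appear afterwards purely to measure how well it works.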
Ok, I see. Thanks for your kind reply.
Hello,
I have a question after reading your paper. Does your defense require knowing the attack type in advance? PointCRT needs to train a classifier to distinguish clean samples from backdoor samples, which makes it attack-dependent. I think such a defense assumption is unreasonable.
Could you help me solve this issue? Looking forward to your reply.