alibaba / easyrobust

EasyRobust: an Easy-to-use library for state-of-the-art Robust Computer Vision Research with PyTorch.
Apache License 2.0

About the information of BN layer. #16

Open xumingyu2021 opened 1 year ago

xumingyu2021 commented 1 year ago

It seems that the BATS code doesn't use the information from the BN layers. Maybe I'm missing something. In addition, I'm curious about how the hyperparameters were selected, such as lam = 1.05 for ImageNet.

ZY123-GOOD commented 11 months ago

You've raised a good question. Batch Normalization (BN) plays a supportive role here by aiming to estimate the mean and variance of the feature distribution. In practice, statistical methods can also be employed to obtain the mean and variance of features. The selection of hyperparameters is indeed quite subtle, as I've observed that optimal hyperparameters vary across different models. I would like to share with you another piece of our work titled "Rethinking Out-of-Distribution Detection From a Human-Centric Perspective" (https://arxiv.org/abs/2211.16778). This work indicates that the algorithm's performance is significantly influenced by the model structure and parameters. This observation might explain why optimal hyperparameters differ across various models. It also reveals the current difficulty in constructing a cross-model universal detection algorithm. I hope my response proves helpful. If you have any further questions, please feel free to email me at ee_zhuy@zju.edu.cn or chat with me on WeChat.
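For illustration, here is a minimal sketch of the "statistical methods" route mentioned above: estimate a per-channel mean and standard deviation from a set of in-distribution features, then clamp a new feature vector to the interval mean ± lam·std. The clamping rule is an assumption about how lam is used; the thread does not spell it out. The function names (`channel_stats`, `clamp_to_typical`) and the toy data are hypothetical and not taken from the EasyRobust code. Plain Python lists stand in for tensors.

```python
from statistics import fmean, pstdev

def channel_stats(features):
    """Per-channel mean and std from a list of feature vectors (rows = samples)."""
    channels = list(zip(*features))  # transpose: one tuple per channel
    means = [fmean(c) for c in channels]
    stds = [pstdev(c) for c in channels]  # population std over the samples
    return means, stds

def clamp_to_typical(feature, means, stds, lam=1.05):
    """Clamp each channel to [mean - lam*std, mean + lam*std] (assumed rule)."""
    return [
        min(max(x, m - lam * s), m + lam * s)
        for x, m, s in zip(feature, means, stds)
    ]

# Toy in-distribution features: 3 samples, 2 channels (hypothetical data).
train_features = [[0.0, 2.0], [1.0, 4.0], [2.0, 6.0]]
means, stds = channel_stats(train_features)

# The first channel is far outside the typical range and gets pulled back;
# the second channel sits exactly at the mean and is left unchanged.
clamped = clamp_to_typical([10.0, 4.0], means, stds, lam=1.05)
print(clamped)
```

A larger lam widens the interval and leaves more features untouched, which is one way to see why a single value need not suit every model.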

xumingyu2021 commented 11 months ago

Thanks for your reply! It helped me a lot. Indeed, using statistical methods to obtain the mean and variance of features can be more practical. I think using quantiles instead of the mean and variance might also be a good choice. Of course, there might be more suitable statistics. I quite agree with your viewpoint in "Rethinking Out-of-Distribution Detection From a Human-Centric Perspective" that "model architectures and training regimes matter in OOD detection and should be considered integral when designing new detection methods." Perhaps future OOD detection methods will need hyperparameters (if any) that are insensitive to the model, or will be able to reveal the relationship between hyperparameters and network architectures or model training.
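The quantile idea could be sketched as follows, purely as an illustration: per-channel lower and upper quantiles of in-distribution features replace the mean ± lam·std interval as clamping bounds. The quantile levels (`q_low`, `q_high`), the helper names, and the toy data are all hypothetical; nothing here comes from the BATS code or the paper.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def quantile_bounds(features, q_low=0.05, q_high=0.95):
    """Per-channel (lower, upper) bounds from feature vectors (rows = samples)."""
    bounds = []
    for channel in zip(*features):
        vals = sorted(channel)
        bounds.append((quantile(vals, q_low), quantile(vals, q_high)))
    return bounds

def clamp_to_bounds(feature, bounds):
    """Clamp each channel of one feature vector to its (lower, upper) bounds."""
    return [min(max(x, lo), hi) for x, (lo, hi) in zip(feature, bounds)]

# Toy features: channel 0 runs 0..10, channel 1 runs 0..-10 (hypothetical data).
train = [[float(i), float(-i)] for i in range(11)]
bounds = quantile_bounds(train, q_low=0.1, q_high=0.9)

# Both channels of this vector lie outside the bounds and get clamped.
clamped = clamp_to_bounds([20.0, -20.0], bounds)
print(bounds, clamped)
```

Unlike a symmetric interval around the mean, quantile bounds can be asymmetric, which may matter for skewed feature distributions. The quantile levels are still hyperparameters, so this does not by itself remove the sensitivity discussed above.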

ZY123-GOOD commented 11 months ago

I agree with you. Since realizing how strongly a model's parameters affect detection algorithms, I have been reflecting on whether some algorithms are insensitive to hyperparameters, and even on how much post-hoc detection methods actually contribute to security. Feel free to contact me for further discussion.