Open xumingyu2021 opened 1 year ago
You've raised a good question. Batch Normalization (BN) plays a supporting role here: it serves to estimate the mean and variance of the feature distribution. In practice, these statistics can also be estimated directly from the features with statistical methods. Hyperparameter selection is indeed quite subtle; I've observed that the optimal hyperparameters vary across models. I would like to share another piece of our work, "Rethinking Out-of-Distribution Detection From a Human-Centric Perspective" (https://arxiv.org/abs/2211.16778). It indicates that a detection algorithm's performance is significantly influenced by the model's structure and parameters, which might explain why the optimal hyperparameters differ across models. It also reveals how difficult it currently is to construct a universal detection algorithm that works across models. I hope my response proves helpful. If you have any further questions, please feel free to email me at ee_zhuy@zju.edu.cn or chat with me on WeChat.
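As a rough illustration of the statistical alternative mentioned in this reply, the sketch below estimates per-channel mean and standard deviation directly from a set of features and clamps new features to an interval around the mean. It assumes the rectification takes the form `[mean - lam * std, mean + lam * std]`, which this thread does not spell out. The function names and the synthetic data are hypothetical and not taken from the BATS code.

```python
import numpy as np

def estimate_stats(features):
    """Per-channel mean and standard deviation of an (N, C) feature array."""
    return features.mean(axis=0), features.std(axis=0)

def clamp_to_typical(features, mean, std, lam=1.05):
    """Clamp each channel to [mean - lam * std, mean + lam * std] (assumed form)."""
    return np.clip(features, mean - lam * std, mean + lam * std)

# Synthetic stand-ins for real penultimate-layer features.
rng = np.random.default_rng(0)
train_feats = rng.normal(loc=2.0, scale=0.5, size=(1000, 8))
test_feats = rng.normal(loc=2.0, scale=3.0, size=(4, 8))

mean, std = estimate_stats(train_feats)
clamped = clamp_to_typical(test_feats, mean, std, lam=1.05)
```

Here `lam` controls the width of the interval, so a value tuned for one model's feature statistics would not necessarily transfer to another.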
Thanks for your reply! It helps me a lot. Indeed, using statistical methods to obtain the mean and variance of features can be more practical. I think using quantiles instead of mean/variance might also be a good choice. Of course, there might be more suitable statistics. I quite agree with your viewpoint in "Rethinking Out-of-Distribution Detection From a Human-Centric Perspective" that "model architectures and training regimes matter in OOD detection and should be considered integral when designing new detection methods." Perhaps future OOD detection methods should be insensitive to their hyperparameters (if they have any), or should reveal the relationship between hyperparameters and network architecture or model training.
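The quantile idea suggested here could look something like the following minimal sketch, where the clamping bounds come from empirical per-channel quantiles rather than mean and variance. The 5% and 95% levels, the function names, and the synthetic data are illustrative assumptions only; nothing in this thread evaluates this variant.

```python
import numpy as np

def quantile_bounds(features, q_low=0.05, q_high=0.95):
    """Per-channel lower and upper quantiles of an (N, C) feature array."""
    lower = np.quantile(features, q_low, axis=0)
    upper = np.quantile(features, q_high, axis=0)
    return lower, upper

def clamp_to_quantiles(features, lower, upper):
    """Clamp each channel to its [lower, upper] quantile interval."""
    return np.clip(features, lower, upper)

# Synthetic stand-ins for real features.
rng = np.random.default_rng(1)
ref_feats = rng.normal(size=(1000, 8))
new_feats = rng.normal(scale=4.0, size=(4, 8))

lower, upper = quantile_bounds(ref_feats)
clamped_q = clamp_to_quantiles(new_feats, lower, upper)
```

Quantile levels are still hyperparameters, so this changes which statistic is tuned rather than removing the tuning.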
I agree with you. Since realizing how strongly a model's parameters affect detection algorithms, I have been reflecting on whether there are algorithms that are insensitive to hyperparameters, and even on how much post-hoc detection methods actually contribute to security. Feel free to contact me for further discussion.
It seems that the BATS code doesn't use the information from the BN layers. Maybe I'm missing something. In addition, I'm curious about how the hyperparameters are selected, such as lam = 1.05 for ImageNet.