Ringhu opened 7 months ago
Because of overfitting, especially when training and testing on different datasets, it is hard to decide when to stop training our method. Different epochs yield different results, and some epochs show higher PRO scores.
Additionally, I observe that existing metrics have their own biases (e.g., AUROC is insensitive to a large number of normal instances being predicted as anomalous). Evaluating anomaly-detection performance with existing metrics may therefore not be entirely suitable. Designing more appropriate metrics, or thoroughly discussing the biases of the various metrics, could be an interesting direction for future work.
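To make the AUROC point concrete, here is a small self-contained sketch (toy scores, not from the paper's evaluation): AUROC is a ranking metric, so even when many normal instances score high enough to be flagged as anomalous at a practical threshold, AUROC can remain perfect as long as every anomaly still outranks every normal instance.

```python
# Toy illustration (assumed data): AUROC only measures ranking, so a large
# number of high-scoring normal instances need not lower it at all.

def auroc(scores_pos, scores_neg):
    # Probability that a random positive outranks a random negative
    # (ties count as 0.5) -- the normalised Mann-Whitney U statistic.
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

anomalous = [0.9] * 10                    # 10 anomalous instances
normal = [0.8] * 100 + [0.1] * 900        # 100 normals score almost as high

print(auroc(anomalous, normal))           # 1.0 -- perfect AUROC, even though
                                          # at a threshold of 0.5 all 100
                                          # high-scoring normals are flagged
```

Every anomaly (0.9) still outranks every normal instance (0.8 or 0.1), so AUROC is 1.0 despite the 100 would-be false positives.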
Please note that the above is my personal viewpoint and is provided for reference only. That said, I very much welcome discussion and counterarguments. :)
Thank you for your work first. I found that both your AUROC and F1-max scores on the MVTec-AD dataset for zero-shot segmentation are higher than WinCLIP's, but the AUPRO is lower (64.6 for WinCLIP vs. 44 for your work). Can you provide some explanation for this? Thank you.
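The gap the question describes is plausible because PRO averages overlap per ground-truth region rather than per pixel, so missing a few small regions hurts AUPRO far more than pixel-wise AUROC or F1. A hedged sketch (toy masks, not the paper's or WinCLIP's evaluation code) of the per-region overlap at one fixed threshold; AUPRO additionally integrates this value over false-positive rates up to a limit (commonly 0.3):

```python
# Toy sketch of per-region overlap (PRO) at a fixed threshold.
# Regions are given as sets of (row, col) pixel coordinates.

def pro_at_threshold(pred_mask, regions):
    """Mean per-region overlap: |prediction ∩ region| / |region|,
    averaged over ground-truth anomaly regions."""
    overlaps = []
    for region in regions:                # each region: set of (row, col)
        hit = sum(1 for px in region if px in pred_mask)
        overlaps.append(hit / len(region))
    return sum(overlaps) / len(overlaps)

# One large region (8 px) fully detected, one small region (2 px) missed.
regions = [{(0, c) for c in range(8)}, {(5, 0), (5, 1)}]
pred = {(0, c) for c in range(8)}         # predicted anomalous pixels

print(pro_at_threshold(pred, regions))    # 0.5 -- PRO weights each region
                                          # equally, so the missed small
                                          # region halves the score even
                                          # though 8 of 10 anomalous pixels
                                          # are found
```

Under this weighting, a method that segments large defects well but misses small ones can score high on pixel-wise AUROC and F1-max yet low on AUPRO, which is one possible reading of the WinCLIP comparison.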