Ringhu opened 7 months ago
Because of overfitting, especially when training and testing on different datasets, it is hard to decide when to stop training our method. Different epochs yield different results, and some epochs show higher PRO scores.
Additionally, I observe that existing metrics have their own biases (e.g., AUROC is insensitive to a large number of normal instances being predicted as anomalous). Evaluating anomaly-detection performance with existing metrics may therefore not be entirely suitable. Designing more appropriate metrics, or thoroughly discussing the biases of the various metrics, could be an interesting direction for future work.
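To make the AUROC point concrete, here is a small self-contained sketch (toy scores, not from the paper's evaluation): AUROC is a ranking metric, so even when many normal instances score high enough to be flagged as anomalous at a practical threshold, AUROC can remain perfect as long as every anomaly still outranks every normal instance.

```python
# Toy illustration (assumed data): AUROC only measures ranking, so a large
# number of high-scoring normal instances need not lower it at all.

def auroc(scores_pos, scores_neg):
    # Probability that a random positive outranks a random negative
    # (ties count as 0.5) -- the normalised Mann-Whitney U statistic.
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

anomalous = [0.9] * 10                    # 10 anomalous instances
normal = [0.8] * 100 + [0.1] * 900        # 100 normals score almost as high

print(auroc(anomalous, normal))           # 1.0 -- perfect AUROC, even though
                                          # at a threshold of 0.5 all 100
                                          # high-scoring normals are flagged
```

Every anomaly (0.9) still outranks every normal instance (0.8 or 0.1), so AUROC is 1.0 despite the 100 would-be false positives.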
Please note that the above is my personal viewpoint and is provided for reference only. That said, I very much welcome discussion and counterarguments. :)
Thank you for your work first. I found that both your AUROC and F1-max scores on the MVTec-AD dataset for zero-shot segmentation are higher than WinCLIP's, but the AUPRO is lower (64.6 for WinCLIP vs. 44 for your work). Can you provide some explanation for this? Thank you.
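The gap the question describes is plausible because PRO averages overlap per ground-truth region rather than per pixel, so missing a few small regions hurts AUPRO far more than pixel-wise AUROC or F1. A hedged sketch (toy masks, not the paper's or WinCLIP's evaluation code) of the per-region overlap at one fixed threshold; AUPRO additionally integrates this value over false-positive rates up to a limit (commonly 0.3):

```python
# Toy sketch of per-region overlap (PRO) at a fixed threshold.
# Regions are given as sets of (row, col) pixel coordinates.

def pro_at_threshold(pred_mask, regions):
    """Mean per-region overlap: |prediction ∩ region| / |region|,
    averaged over ground-truth anomaly regions."""
    overlaps = []
    for region in regions:                # each region: set of (row, col)
        hit = sum(1 for px in region if px in pred_mask)
        overlaps.append(hit / len(region))
    return sum(overlaps) / len(overlaps)

# One large region (8 px) fully detected, one small region (2 px) missed.
regions = [{(0, c) for c in range(8)}, {(5, 0), (5, 1)}]
pred = {(0, c) for c in range(8)}         # predicted anomalous pixels

print(pro_at_threshold(pred, regions))    # 0.5 -- PRO weights each region
                                          # equally, so the missed small
                                          # region halves the score even
                                          # though 8 of 10 anomalous pixels
                                          # are found
```

Under this weighting, a method that segments large defects well but misses small ones can score high on pixel-wise AUROC and F1-max yet low on AUPRO, which is one possible reading of the WinCLIP comparison.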