hq-deng / RD4AD

Anomaly Detection via Reverse Distillation from One-Class Embedding

A question about the evaluation #18

Open liwei33660 opened 1 year ago

liwei33660 commented 1 year ago

Thank you for sharing!

Your clear and compact implementation inspired me a lot, but I have some questions about the validity of the evaluation. As we know, RD4AD was evaluated on the widely used MVTecAD dataset and demonstrated impressive performance. Since MVTecAD provides only a training set and a test set, the paper follows that setup and does not mention a validation set. However, I am still confused about the following:

  1. Because there is no validation set, the final results are produced by a model that is already known to perform well on the test set; in effect, the test set drives both model selection and the reported score (see the sketch after this list). Is it reasonable to generate results this way? Doesn't this amount to circular analysis?
  2. State-of-the-art papers in image anomaly detection adopt this evaluation setup without a validation set, whether they use public datasets (e.g. MVTecAD) or self-built ones. In your opinion, how should this phenomenon be explained? Does it mean that a fixed paradigm has formed in the field of image anomaly detection?
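
For concreteness, here is a minimal sketch of the protocol I am describing in point 1. The names `train_one_epoch`, `evaluate_auroc`, `model`, and the loaders are hypothetical stand-ins, not functions from this repository:

```python
# Hypothetical sketch of the common protocol, not code from this repo.
best_auroc, best_state = 0.0, None
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)        # train on normal images only
    auroc = evaluate_auroc(model, test_loader)  # test set contains the real anomalies
    if auroc > best_auroc:                      # the test set selects the checkpoint...
        best_auroc, best_state = auroc, model.state_dict()
print(best_auroc)                               # ...and also produces the reported score
```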

Thank you very much for taking the time to read this!

hq-deng commented 1 year ago

Hello,

In my opinion, the "anomaly" in anomaly detection should be unknown, so we shouldn't validate the algorithm on a real anomaly dataset. However, we still need some validation procedure to tune the hyper-parameters. Recently, I have found some studies related to this:

1. We can generate a pseudo-anomaly dataset to tune the model: http://medicalood.dkfz.de/web/
2. As the number of iterations increases, the model should converge to an optimum, so we may not need a validation set to find the best epoch: https://github.com/zhiyuanyou/UniAD
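
As an illustration of point (1), here is a minimal sketch of building a pseudo-anomaly validation set from held-out normal training images via a CutPaste-style corruption. Everything here (function names, image shapes, patch size) is my own assumption, not code from this repo:

```python
import numpy as np

def make_pseudo_anomaly(image, rng, patch=32):
    """Paste a random patch of the image onto another location,
    producing a synthetic 'defect' on an otherwise normal image."""
    h, w = image.shape[:2]
    out = image.copy()
    ys, xs = rng.integers(0, h - patch), rng.integers(0, w - patch)  # source corner
    yd, xd = rng.integers(0, h - patch), rng.integers(0, w - patch)  # destination corner
    out[yd:yd + patch, xd:xd + patch] = image[ys:ys + patch, xs:xs + patch]
    return out

rng = np.random.default_rng(0)
# In practice these would be normal images held out from the *training* split;
# random arrays are used here only to keep the sketch self-contained.
normal_held_out = [rng.random((256, 256, 3)) for _ in range(8)]
val_images = normal_held_out + [make_pseudo_anomaly(im, rng) for im in normal_held_out]
val_labels = [0] * len(normal_held_out) + [1] * len(normal_held_out)
# val_images / val_labels can now drive hyper-parameter or threshold tuning
# without ever touching the real anomalies in the test set.
```

And for point (2), a sketch of convergence-based stopping: train until the loss plateaus, then evaluate on the test set exactly once. Again, `compute_epoch_loss` and the thresholds are hypothetical stand-ins:

```python
# Stop on training-loss convergence instead of picking the best test epoch.
prev_loss, tol, patience, stalls = float("inf"), 1e-4, 3, 0
for epoch in range(max_epochs):
    loss = compute_epoch_loss(model, train_loader)  # one epoch; returns mean loss
    stalls = stalls + 1 if prev_loss - loss < tol else 0
    if stalls >= patience:  # loss has plateaued: treat the model as converged
        break
    prev_loss = loss
# Only now touch the test set, once, with the converged checkpoint.
```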

I think anomaly detection is still a developing area, and recent studies are aiming to solve this problem.