anomaly localization results are close, but anomaly detection results are different.

zhanjw commented 2 years ago

I used the configuration you provided:

./experiments/MVTec-AD/config.yaml

This configuration does not contain:

  metrics:
    auc:
      - name: mean

I added it and tried to reproduce the results.

The mean AUROC seems a bit low, especially for screw.

Are there any tips for training, or anything else I should be aware of?

clsname	mean	pixel	max	std
capsule	0.83566	0.98597	0.913841	0.871959
bottle	0.988095	0.97981	1	1
toothbrush	0.888889	0.983198	0.936111	0.972222
screw	0.539045	0.987008	0.913302	0.947325
transistor	0.919167	0.982308	0.99875	0.99625
wood	0.961404	0.930861	0.985965	0.980702
tile	0.994949	0.919262	0.993506	0.997835
hazelnut	0.995357	0.980919	1	0.997857
leather	1	0.987544	1	1
pill	0.832242	0.960712	0.945717	0.875614
grid	0.951546	0.972235	0.988304	0.993317
metal_nut	0.882209	0.932279	0.995112	0.969697
zipper	0.985557	0.974935	0.979254	0.982668
cable	0.923913	0.974855	0.957271	0.958958
carpet	0.886437	0.983814	0.997994	0.998796
mean	0.905631	0.969047	0.973675	0.969547

zhiyuanyou commented 2 years ago

Hi, your results are right.

The mean, max, and std in the header row of your table actually refer to post-processing methods. The anomaly localization result is an anomaly map of shape H x W, and we need to convert this map into a scalar that serves as the anomaly score for the whole image. For this conversion, you have three options:

use the mean value of the anomaly map
use the max value of the (average pooled) anomaly map
use the std value of the anomaly map
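The three options above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's actual code: the function name `image_score`, the `pool_size` parameter, and the stride-1 sliding window used for average pooling are all assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def image_score(anomaly_map, method="max", pool_size=16):
    """Reduce an H x W anomaly map to one scalar image-level score.

    Hypothetical helper: names and the pooling window are assumptions.
    """
    amap = np.asarray(anomaly_map, dtype=np.float64)
    if method == "mean":
        # Average over every pixel of the map.
        return float(amap.mean())
    if method == "std":
        # Standard deviation over every pixel of the map.
        return float(amap.std())
    if method == "max":
        # Average-pool with a stride-1 sliding window, then take the maximum.
        windows = sliding_window_view(amap, (pool_size, pool_size))
        return float(windows.mean(axis=(-2, -1)).max())
    raise ValueError(f"unknown method: {method}")


# Toy 8 x 8 map with a 2 x 2 high-response region.
toy_map = np.zeros((8, 8))
toy_map[2:4, 2:4] = 1.0
scores = {m: image_score(toy_map, m, pool_size=2) for m in ("mean", "max", "std")}
print(scores)
```

Pooling before taking the maximum makes the max score less sensitive to a single noisy pixel; the window size here is illustrative only.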

In our paper, we use max for MVTec-AD and use mean for CIFAR-10.

zhiyuanyou commented 2 years ago

Another question: why is mean so poor for screw? For screw, the foreground region is very small, so using mean for post-processing takes too much irrelevant background into account. Mean is therefore clearly not a good choice for screw.
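A toy illustration of this effect, using made-up numbers rather than real screw data: when the defect covers only a few pixels, a small shift in the background response can outweigh it in the mean, while the max still picks it up. The map size, defect size, and response values below are assumptions, and the plain max is used without average pooling to keep the sketch short.

```python
import numpy as np

H = W = 64

# Normal image: uniform low background response (synthetic value).
normal_map = np.full((H, W), 0.10)

# Anomalous image: slightly lower background, plus a tiny 4 x 4 defect
# with a strong response (synthetic values).
anomalous_map = np.full((H, W), 0.09)
anomalous_map[30:34, 30:34] = 0.90

mean_normal, mean_anomalous = normal_map.mean(), anomalous_map.mean()
max_normal, max_anomalous = normal_map.max(), anomalous_map.max()

# With mean, the 16 defect pixels are diluted by 4080 background pixels,
# so the anomalous image can even score lower than the normal one.
print("mean:", mean_normal, mean_anomalous)
# With max, the defect dominates the score.
print("max: ", max_normal, max_anomalous)
```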

zhiyuanyou commented 2 years ago

Therefore, your final results are 96.9 for localization and 97.4 for detection, which is even better than the results in our paper (we used 8 GPUs, which usually gives poorer results than 1 or 2 GPUs).

We will also add an explanation of mean, max, and std to the README.
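As a quick check of these two headline numbers, the per-class pixel and max values from the table posted above can be averaged directly. The variable names are just for this sketch.

```python
# Per-class (pixel AUROC, max AUROC) copied from the table in this thread.
results = {
    "capsule": (0.98597, 0.913841),
    "bottle": (0.97981, 1.0),
    "toothbrush": (0.983198, 0.936111),
    "screw": (0.987008, 0.913302),
    "transistor": (0.982308, 0.99875),
    "wood": (0.930861, 0.985965),
    "tile": (0.919262, 0.993506),
    "hazelnut": (0.980919, 1.0),
    "leather": (0.987544, 1.0),
    "pill": (0.960712, 0.945717),
    "grid": (0.972235, 0.988304),
    "metal_nut": (0.932279, 0.995112),
    "zipper": (0.974935, 0.979254),
    "cable": (0.974855, 0.957271),
    "carpet": (0.983814, 0.997994),
}

n = len(results)
localization = sum(pixel for pixel, _ in results.values()) / n
detection = sum(mx for _, mx in results.values()) / n

# Expressed as percentages rounded to one decimal place.
print(f"localization: {100 * localization:.1f}")
print(f"detection:    {100 * detection:.1f}")
```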

18894269590 commented 9 months ago

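Hello author, we ran MVTecAD with the settings from the paper, bs=64 and efficientnet_b4 as the backbone. However, with mean_max_auc as the key_metric, the ckpt_best.pth.tar saved over 1000 epochs gives the performance metrics below, and many of them fall short of the numbers in the paper. Why is this? Looking forward to your reply.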

zhiyuanyou / UniAD

anomaly localization results are close, but anomaly detection results are different. #6