DonaldRR / SimpleNet


Really not good on the PRO curve. #49

Closed zzhml closed 6 months ago

zzhml commented 8 months ago

I reproduced the image-level AUROC of 0.972, the full-pixel AUROC of 0.989, and the anomaly-pixel AUROC of 0.938. I plotted the PR curve (plot attached in the original issue); with thresholds of 0.07937 and 0.619, the confusion matrices are: image-level confusion matrix = [[18, 5], [1, 108]], pixel-level confusion matrix = [[9707368, 1110670], [225, 130408]]. Here, the image scores and pixel scores were normalized using the minimum/maximum values of the validation dataset. There are comparatively many false positives at the pixel level: when the recall is close to 0.9, the precision at the pixel level is only 0.1.
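For context, here is a minimal sketch of the normalization and thresholding described above. All scores, array shapes, and the threshold pairing are placeholders for illustration, not values taken from the repository's code:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def min_max_normalize(scores, val_min, val_max):
    # Rescale raw anomaly scores with min/max statistics taken
    # from the validation split, as described above.
    return (scores - val_min) / (val_max - val_min + 1e-12)

# Stand-in data: random scores and labels, not real SimpleNet outputs.
val_scores = np.random.rand(100)        # validation anomaly scores
test_scores = np.random.rand(132)       # test anomaly scores
labels = np.random.randint(0, 2, 132)   # 0 = good, 1 = anomalous

normed = min_max_normalize(test_scores, val_scores.min(), val_scores.max())
preds = (normed >= 0.619).astype(int)   # one of the thresholds quoted above
print(confusion_matrix(labels, preds))  # rows: true class, cols: prediction
```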

I have modified the SimpleNet._evaluate method to plot a PRO curve. The result is not good, and it is even worse than the PRO curve of the variation-model baseline in the original MVTec AD dataset paper (https://link.springer.com/article/10.1007/s11263-020-01400-4).
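For readers unfamiliar with the metric: PRO (per-region overlap) averages, for each connected component of the ground-truth mask, the fraction of that region's pixels detected, and plots this against the pixel-level false positive rate, usually only up to FPR = 0.3. Below is a minimal NumPy/scikit-image sketch of such a curve; the function name, the uniform threshold sweep, and the argument layout are my own choices, not code from this repository:

```python
import numpy as np
from skimage.measure import label

def pro_curve(anomaly_maps, gt_masks, num_thresholds=200, fpr_limit=0.3):
    """Per-region overlap (PRO) vs. pixel-level false positive rate.

    anomaly_maps: (N, H, W) float anomaly scores.
    gt_masks:     (N, H, W) binary ground-truth masks.
    A minimal sketch, not the optimized official MVTec evaluation.
    """
    scores = np.stack(anomaly_maps)
    masks = np.stack(gt_masks).astype(bool)
    neg = ~masks  # defect-free pixels
    thresholds = np.linspace(scores.min(), scores.max(), num_thresholds)
    fprs, pros = [], []
    for t in thresholds:
        binarized = scores >= t
        # False positive rate over all normal pixels.
        fpr = binarized[neg].sum() / max(neg.sum(), 1)
        if fpr > fpr_limit:
            continue
        overlaps = []
        for b, m in zip(binarized, masks):
            # Each connected component of the GT mask is one "region".
            regions = label(m)
            for r in range(1, regions.max() + 1):
                region = regions == r
                overlaps.append(b[region].sum() / region.sum())
        fprs.append(fpr)
        pros.append(np.mean(overlaps) if overlaps else 0.0)
    return np.array(fprs), np.array(pros)
```

The commonly reported PRO score is then the area under this curve for FPR ≤ 0.3, normalized so that a perfect detector scores 1.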

The threshold estimation method depends on the validation dataset, which is not available in most real-world applications. This is a common problem for anomaly detection tasks.
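One common workaround, sketched below under the assumption that only defect-free training images are available, is to calibrate the threshold from the score distribution on the training set. This is an illustrative heuristic, not something SimpleNet implements:

```python
import numpy as np

def estimate_threshold_from_train(train_scores, quantile=0.99):
    # Heuristic for when no labeled validation set exists: take a high
    # quantile (or the max) of the anomaly scores computed on the
    # defect-free training images as the decision threshold.
    return np.quantile(np.asarray(train_scores), quantile)

# Usage with placeholder scores:
threshold = estimate_threshold_from_train(np.random.rand(1000))
```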

DonaldRR commented 6 months ago

Thanks for the reminder and the contribution; I'm sorry for missing that metric in the report. There's a lot more work to do. The PRO score seems to require more structural semantics, while this work uses local patches for discrimination.