Question about CutPaste paper: Does CutPaste need to train two models for defect detection and localization?

ghost commented 2 years ago

Section 4.2 of the paper says:

We further evaluate the pixel-wise localization AUC, achieving 88.3.

and

We report a localization AUC in Table 2. Our patch-based model achieves 96.0 AUC.

Section 2.4 also says:

In Section 4.2, we visualize a heatmap using patch-level detector for defect localization, along with that of an image-level detector using visual explanation techniques such as GradCAM.

And the training objectives defined in formulas (1) and (3) are different.

Does CutPaste need to train two models for defect detection and localization?

It seems that they trained an image-level detector (3-way) whose detection result reaches 95.2 AUC, while its localization result only reaches 88.3. To get the localization result of 96.0, they need to train another model that takes patches as input. Please correct me if I am wrong.
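
For reference, the pixel-wise localization AUC quoted above can be sketched as a ROC AUC over per-pixel anomaly scores against ground-truth defect masks. This is a minimal NumPy sketch, not the paper's evaluation code: `pixel_auroc` is a hypothetical helper, and pooling the pixels of all test images into one ranking is an assumption about the protocol.

```python
import numpy as np


def pixel_auroc(heatmaps, masks):
    """ROC AUC over pixels, pooled across all images (assumed protocol).

    heatmaps: array of anomaly scores, e.g. shape (N, H, W).
    masks:    array of the same shape, nonzero where a pixel is defective.
    """
    scores = np.asarray(heatmaps, dtype=np.float64).ravel()
    labels = np.asarray(masks).ravel().astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both defective and normal pixels")

    # Average 1-based rank of every score, with ties sharing one rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    ranks = (last_rank - (counts - 1) / 2.0)[inverse]

    # Mann-Whitney U statistic, normalised to [0, 1], equals the ROC AUC.
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy data: one 2x2 heatmap and its ground-truth mask.
toy_heatmaps = np.array([[[0.1, 0.4], [0.35, 0.8]]])
toy_masks = np.array([[[0, 0], [1, 1]]])
toy_auc = pixel_auroc(toy_heatmaps, toy_masks)
```

The same function can score either localization output, since both a GradCAM map and a patch-level heatmap reduce to one score per pixel.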

Runinho commented 2 years ago

Hi, they use two different methods for localization: GradCAM and a patch heatmap. The GradCAM method reuses the image-level detector, so it does not require retraining, and achieves 88.3 AUC. The patch heatmap scores 96.0 AUC and requires training a model on 64×64 patches, as described in Section 2.4:

[...] we learn a representation of an image patch using CutPaste prediction, as in Section 2.4. We train models of 64×64 patches from 256×256 image.

Does CutPaste need to train two models for defect detection and localization?

It depends on the localization method you want to use. With GradCAM, no. With the patch-based method, yes.
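
To make the patch-based route concrete, here is a rough sketch of how per-patch scores could be turned into a heatmap: slide a 64×64 window over the 256×256 image, score each patch, and average the scores of all windows covering a pixel. The stride of 32, the averaging step, and the `toy_score` function are assumptions for illustration only. In practice the score would come from the separately trained patch-level model, and the paper's exact aggregation may differ.

```python
import numpy as np


def patch_heatmap(image, score_patch, patch_size=64, stride=32):
    """Average per-patch anomaly scores over every pixel each patch covers."""
    height, width = image.shape[:2]
    score_sum = np.zeros((height, width), dtype=np.float64)
    hits = np.zeros((height, width), dtype=np.float64)
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = image[top:top + patch_size, left:left + patch_size]
            score = float(score_patch(patch))
            score_sum[top:top + patch_size, left:left + patch_size] += score
            hits[top:top + patch_size, left:left + patch_size] += 1.0
    # Pixels never covered by a window keep a score of 0.
    return score_sum / np.maximum(hits, 1.0)


def toy_score(patch):
    # Hypothetical stand-in for the trained patch-level detector:
    # here the "anomaly score" is simply the mean intensity of the patch.
    return patch.mean()


# Toy 256x256 image with a bright 64x64 "defect" in the centre.
image = np.zeros((256, 256))
image[96:160, 96:160] = 1.0
heatmap = patch_heatmap(image, toy_score)
```

A smaller stride gives a finer heatmap at the cost of more forward passes through the patch model.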

Another repo also implements these localization methods and might be interesting to you: https://github.com/LilitYolyan/CutPaste

Hope this helps your understanding.

ghost commented 2 years ago

Thank you for your prompt reply and detailed explanation! The LilitYolyan repo implements the localization part in dataset.py. Thanks for sharing!

Runinho / pytorch-cutpaste

Question about CutPaste paper: Does CutPaste need to train two models for defect detection and localization? #23