tzjtatata / Myriad

Open-sourced codes, IAD vision-language datasets and pre-trained checkpoints for Myriad.

Inquiry on Addressing Performance Issues in Zero-Shot Settings #7

Open yjtlab opened 6 months ago

yjtlab commented 6 months ago

Thanks for your work in the anomaly detection domain. I am reaching out to discuss an aspect of your work that caught my attention, specifically regarding the experiments conducted in a zero-shot setting.

My question centers around how you addressed the potential increase in anomaly scores for normal samples when transferring the model trained on one dataset (e.g., MVTec) to perform zero-shot anomaly detection on a different dataset (e.g., VisA). Commonly, such a transition might result in higher anomaly scores for normal samples in the new dataset, potentially leading to an increased false positive rate.

Could you elaborate on the strategies or methodologies employed in your work to mitigate this issue? Thank you for your time and consideration. I look forward to your insights on this issue.
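
As a minimal, self-contained illustration of the failure mode described above (all numbers are synthetic), the sketch below calibrates a threshold on one score distribution and applies it to a shifted one. The per-dataset min-max normalization at the end is a generic mitigation often used in zero-shot AD evaluation, not something claimed in this thread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Scores of NORMAL samples on the calibration dataset (MVTec-like) and on a
# shifted target dataset (VisA-like): same spread, higher mean under the shift.
src_normal = rng.normal(loc=0.30, scale=0.05, size=5000)
tgt_normal = rng.normal(loc=0.45, scale=0.05, size=5000)

# Threshold picked for a 1% false positive rate on the source domain.
thr = np.quantile(src_normal, 0.99)

print(f"FPR on source: {np.mean(src_normal > thr):.3f}")  # ~0.01 by construction
print(f"FPR on target: {np.mean(tgt_normal > thr):.3f}")  # much larger under the shift

# One common dataset-level mitigation: re-normalize scores per dataset
# (e.g., min-max over the test split) before applying a fixed threshold.
def minmax(scores: np.ndarray) -> np.ndarray:
    return (scores - scores.min()) / (scores.max() - scores.min() + 1e-8)
```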

tzjtatata commented 6 months ago

Thanks for your attention. By definition, a zero-shot anomaly detector is trained on one dataset and tested on another, so the training and test sets are naturally different, and anomaly scores may rise because of the gap between them. The usual way to cope with this is to build on powerful pre-trained backbones: AprilGAN and WinCLIP use CLIP, a strong pre-trained vision-language model, while we use MiniGPT-4, which is designed for open-set/grounding usage. Further, Myriad can use zero-shot vision experts such as AprilGAN and WinCLIP to provide a prior that helps the vision encoder attend to potential defect regions. With these vision-expert-guided visual features, MiniGPT-4 can decide whether there are unusual parts in the image, thanks to its large-scale language and multimodal pre-training.
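
Myriad's actual fusion module is defined in the released code and paper; the snippet below is only a rough sketch of the idea described above, in which a frozen expert's anomaly map gates the patch tokens before they reach the LLM. The names (`ExpertGuidedFusion`, `proj`) and shapes are hypothetical:

```python
import torch
import torch.nn as nn

class ExpertGuidedFusion(nn.Module):
    """Reweight patch tokens with an expert anomaly map before the LLM.

    `anomaly_map` is assumed to come from a frozen zero-shot expert such as
    AprilGAN or WinCLIP, resized to the patch grid and normalized to [0, 1].
    """
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)  # hypothetical adapter into LLM space

    def forward(self, patch_tokens: torch.Tensor, anomaly_map: torch.Tensor):
        # patch_tokens: (B, N, D); anomaly_map: (B, N) with values in [0, 1].
        # Emphasize tokens the expert flags as potentially defective, while
        # keeping the original signal (residual-style gating).
        gated = patch_tokens * (1.0 + anomaly_map.unsqueeze(-1))
        return self.proj(gated)

# Toy usage: one image, 196 patch tokens of width 1024.
tokens = torch.randn(1, 196, 1024)
amap = torch.rand(1, 196)
fused = ExpertGuidedFusion(1024)(tokens, amap)
print(fused.shape)  # torch.Size([1, 196, 1024])
```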

yjtlab commented 6 months ago

Thank you for your response. I understand the design of your network architecture. My confusion lies in the increase in false positive rates due to distribution shifts in zero-shot scenarios. For example, an AprilGAN trained on the MVTec dataset and then transferred to the VisA dataset yields higher anomaly scores, consequently increasing the false positive rate. I was wondering if you might have any tricks or suggestions to mitigate this issue?

tzjtatata commented 6 months ago

Zero-shot anomaly detectors are designed to fight distribution shifts. For example, AprilGAN is trained on 12 or 15 different products (VisA or MVTec-AD, respectively), so there are already distribution shifts (or domain shifts) among the training products. This training strategy forces the zero-shot anomaly detector to counter distribution shift.

So I think the increase in false positive rates might have other causes. For example, the linear projection or the objective of AprilGAN may not be sufficient to learn knowledge that generalizes across the domain shifts within the training set. For Myriad, we observe that our proposed LoRA feature adaptor helps in the one-class setting but slightly hurts zero-/few-shot performance.
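
For context, the snippet below is a minimal sketch of a LoRA-style feature adaptor that can be switched off at inference time, which makes it easy to A/B the adapter between one-class and zero-/few-shot settings. The names are hypothetical, and this is not the adaptor shipped in this repository:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank residual adapter."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False      # only the adapter is trained
        self.A = nn.Linear(base.in_features, rank, bias=False)
        self.B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)    # adapter starts as a no-op
        self.scale = alpha / rank
        self.enabled = True

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        if self.enabled:
            out = out + self.scale * self.B(self.A(x))
        return out

layer = LoRALinear(nn.Linear(1024, 1024))
layer.enabled = False  # e.g., disable the adapter for zero-shot evaluation
print(layer(torch.randn(1, 1024)).shape)  # torch.Size([1, 1024])
```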
