sjtuplayer / anomalydiffusion

[AAAI 2024] AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model
MIT License
114 stars 14 forks

Question about the TI and the training samples #24

Open DaiZhewei opened 4 months ago

DaiZhewei commented 4 months ago

Hi author, I would like to ask one question: when training the TI, did you use all the masks in the test set, or only the first 1/3 of the masks? Looking forward to your answer, thanks!

sjtuplayer commented 4 months ago

Sorry, I do not know what 'TI' refers to. But for the entire training process, we always use the first 1/3 of the data.

DaiZhewei commented 4 months ago

> Sorry, I do not know what 'TI' refers to. But for the entire training process, we always use the first 1/3 of the data.

It refers to the Textual Inversion (TI) method that you used to generate different masks.

sjtuplayer commented 4 months ago

> Sorry, I do not know what 'TI' refers to. But for the entire training process, we always use the first 1/3 of the data.

> It refers to the Textual Inversion (TI) method that you used to generate different masks.

If so, we only use the first 1/3 of the masks.
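For readers reproducing this split, a minimal sketch of taking the first 1/3 of the masks, assuming MVTec-AD-style ground-truth files sorted lexicographically (the directory layout and helper name are illustrative, not the repository's code):

```python
import glob
import os

def first_third_masks(mask_dir):
    """Return the first 1/3 of the mask files in mask_dir, sorted by filename.

    Assumes MVTec-AD style naming, e.g. 000_mask.png, 001_mask.png, ...
    """
    masks = sorted(glob.glob(os.path.join(mask_dir, "*.png")))
    return masks[: len(masks) // 3]
```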

DaiZhewei commented 4 months ago

Thanks! I have one more question: the training sample image shown in the paper is the 7th image of the wood_color class (there are 8 samples in this class), which does not belong to the first 1/3 of the data you mentioned.

sjtuplayer commented 4 months ago

That experiment is not from the final version, in which we trained the models by randomly selecting 1/3 of the data. Since it is only for qualitative comparison, we keep it to show the comparison results. Sorry for the confusion.

engrmusawarali commented 4 months ago

Dear Hu Teng,

I observed red color in the generated images for metal_nut; however, there is no red color in the training set. Could you please clarify this issue?

sjtuplayer commented 4 months ago

> Dear Hu Teng,
>
> I observed red color in the generated images for metal_nut; however, there is no red color in the training set. Could you please clarify this issue?

For this extremely biased category, we use crop-paste: we manually crop the red region from 012.png and paste it onto 000.png using the mask of 000.png. Since this is not a common case, we do not mention it in the paper or code. In practice, the generated red color does not noticeably improve anomaly detection accuracy.
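The crop-paste step described above can be sketched with NumPy as follows. The file names 012.png and 000.png come from the comment; the helper itself is a hypothetical illustration under the assumption that both images and the destination mask have matching sizes, not the repository's actual code:

```python
import numpy as np

def crop_paste(src_img, dst_img, dst_mask):
    """Paste pixels from src_img into dst_img wherever dst_mask is set.

    src_img, dst_img: HxWx3 uint8 arrays of the same size
    dst_mask: HxW binary array (the anomaly mask of the destination image)
    Returns a new image; the inputs are left untouched.
    """
    out = dst_img.copy()
    m = dst_mask.astype(bool)
    out[m] = src_img[m]  # boolean mask broadcasts over the channel axis
    return out
```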

engrmusawarali commented 4 months ago

Dear Hu Teng,

Thank you so much for your kindness. For how many classes did you do this? Could you please clarify? I observed similar things in carpet and wood.

DaiZhewei commented 4 months ago

Hi, thank you very much for your answer. I have three more questions:

  1. For object-class defects, how do you make sure the randomly generated mask falls on the corresponding position of the object, rather than in an unreasonable or background area? I have read about a mask-filtering operation in other issues; how do you filter, given that the object area differs in each image?
  2. For the logical-class defects (cable swap, metal_nut flip, and transistor misplaced), is there any special treatment, and how is the inference process implemented?
  3. We downloaded the provided segmentation model (U-Net weights) and found that the U-Net segments logical-class defects very well. How is this achieved?
sjtuplayer commented 4 months ago

> Hi, thank you very much for your answer. I have three more questions:
>
>   1. For object-class defects, how do you make sure the randomly generated mask falls on the corresponding position of the object, rather than in an unreasonable or background area? I have read about a mask-filtering operation in other issues; how do you filter, given that the object area differs in each image?
>   2. For the logical-class defects (cable swap, metal_nut flip, and transistor misplaced), is there any special treatment, and how is the inference process implemented?
>   3. We downloaded the provided segmentation model (U-Net weights) and found that the U-Net segments logical-class defects very well. How is this achieved?
  1. There is indeed a problem where the anomaly mask sometimes falls outside the object. We do not filter these out in our experiments, since some anomaly types have anomalies outside the object while others do not. But you can try using an object segmentation model to filter out some of the unreasonable masks.
  2. We do not treat the logical-class defects specially.
  3. For metal_nut flip, I think it can also be regarded as a kind of texture difference. For transistor misplaced, it mainly relies on the mask generation method, since this anomaly is strongly tied to position. For cable swap, I found that our generated data contains some wrongly generated results. Therefore, even without a special design for logical anomalies, our model still performs well in most situations (metal_nut flip and transistor misplaced), but it does not perform as well on cases like cable swap.
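One way to implement the mask filtering suggested in answer 1 — discard generated anomaly masks that fall mostly outside a foreground segmentation of the object — could look like the sketch below. The 0.9 threshold and the function name are arbitrary assumptions for illustration:

```python
import numpy as np

def keep_mask(anomaly_mask, object_mask, min_overlap=0.9):
    """Keep a generated anomaly mask only if at least `min_overlap`
    of its area lies inside the object's foreground mask.

    Both masks are HxW binary arrays of the same size.
    """
    a = anomaly_mask.astype(bool)
    area = a.sum()
    if area == 0:
        return False  # an empty mask is useless anyway
    inside = (a & object_mask.astype(bool)).sum()
    return inside / area >= min_overlap
```

Note that, as the author says, this filter should be applied selectively: some anomaly types legitimately extend beyond the object boundary.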
sjtuplayer commented 4 months ago

> Dear Hu Teng,
>
> Thank you so much for your kindness. For how many classes did you do this? Could you please clarify? I observed similar things in carpet and wood.

I need to recount it carefully. But I'm quite busy before ACM MM 2024; I will reply to you once I have enough free time, or after ACM MM. Thanks for your patience.

engrmusawarali commented 4 months ago

Thank you @sjtuplayer, for your kind response.