sjtuplayer / anomalydiffusion

[AAAI 2024] AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model
MIT License

Confusion about the normal region of the generated image #44

Open xrli-U opened 2 months ago

xrli-U commented 2 months ago

"while the remaining region is consistent with the input anomaly-free sample" is mentioned in the first paragraph of the Method section of the paper, but in the generated images you provided, the normal region is not consistent with the anomaly-free samples. For example, for the white background part of the bottle, all the background part of anomaly-free samples are white (255, 255, 255), while the generated images you provide are not. In the capsule, the logo(actavis) of the generated image is also blurry and distorted than the real one, just like the generated one.

Could you answer my question? Thank you very much!

Musawar71 commented 2 months ago

Dear @xrli-U, Stable Diffusion encodes images with its VAE encoder, which is a lossy compression; secondly, there may be small changes introduced by the blended diffusion. Overall the background looks the same, but it is not perfectly preserved and will differ numerically from the original background, as you noticed. That is also why "actavis" appears blurred. The paper says consistent, not identical. (I am not an author of the paper; I just like to participate in the discussion.)

xrli-U commented 2 months ago

In the final step of generation, if the mask is used to directly replace the normal region with the original anomaly-free image, the authenticity may be improved. Is that right?

Musawar71 commented 2 months ago

Yes, this may be a good approach, but it cannot be done directly inside latent diffusion models, since the blending happens in latent space rather than pixel space.

boxbox2 commented 2 weeks ago

In the final step of generation, if the mask is used to directly replace the normal region with the original anomaly-free image, the authenticity may be improved. Is that right?

Hi, did you implement this idea (using the mask to directly replace the normal region with the original anomaly-free image)? I pasted the mask region of the generated image back onto the original (ori), and the result was poor. I suspect the model may simply be compositing the mask into good images from MVTec without actually using diffusion; I have not yet found the part of the code that composites the mask with the good image. (Attached images: 1. the model's generated image; 2. the mask region of the generated image pasted onto ori; 3. the model-generated mask; 4. ori.) As you can see, the result is not good. If the model does not recognize this kind of defect, then the generated anomaly image cannot be improved by region replacement either.

import os
import cv2
import numpy as np

# Directory paths: generated images, their masks, and the originals
image_dir = '/data/generated_dataset/carpet/thread/image'
mask_dir = '/data/generated_dataset/carpet/thread/mask'
ori_dir = '/data/generated_dataset/carpet/thread/ori'
output_dir = './output/thread'

# Make sure the output directory exists
os.makedirs(output_dir, exist_ok=True)

# Collect all file names
image_files = os.listdir(image_dir)

for image_file in image_files:
    # Build the file paths
    image_path = os.path.join(image_dir, image_file)
    mask_path = os.path.join(mask_dir, image_file)
    ori_path = os.path.join(ori_dir, image_file)
    output_path = os.path.join(output_dir, image_file)

    # Read the generated image, its mask (as grayscale), and the original
    image = cv2.imread(image_path)
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    ori = cv2.imread(ori_path)

    # Skip files that failed to load
    if image is None or mask is None or ori is None:
        print(f"Failed to read one of the files for {image_file}. Skipping.")
        continue

    # Make sure the mask, generated image, and original have matching sizes
    if mask.shape[:2] != ori.shape[:2] or image.shape[:2] != ori.shape[:2]:
        print(f"Size mismatch for {image_file}. Skipping.")
        continue

    # Boolean mask of the defect (nonzero) region
    defect = mask != 0

    # Paste the defect region of the generated image onto the original
    ori[defect] = image[defect]

    # Save the result
    cv2.imwrite(output_path, ori)

print("Done!")

This is my simple implementation of the idea. Perhaps the method you mentioned means something different; we could discuss further how to solve this problem of small defects not being salient enough.

xrli-U commented 2 weeks ago


I did not try it again after that. Personally I am not comfortable working with this codebase, so our research group is developing a new approach of our own.