HuiZhang0812 / DiffusionAD


Does the model have to be trained on all classes in the entire dataset? #48

Closed. boxbox2 closed this issue 5 months ago

boxbox2 commented 5 months ago

Because training on the entire MVTec dataset is too time-consuming, I trained only the 'carpet' category. This is my args.json, with batch size = 4 and epochs = 1500, which differs from the author's batch size = 16 and epochs = 3000.

{
  "img_size": [256,256],
  "Batch_Size": 4,
  "EPOCHS": 1500,
  "T": 1000,
  "base_channels": 128,
  "beta_schedule": "linear",
  "loss_type": "l2",
  "diffusion_lr": 1e-4,
  "seg_lr": 1e-5,
  "random_slice": true,
  "weight_decay": 0.0,
  "save_imgs":true,
  "save_vids":false, 
  "dropout":0,
  "attention_resolutions":"32,16,8",
  "num_heads":4,
  "num_head_channels":-1,
  "noise_fn":"gauss",
  "channels":3,
  "mvtec_root_path":"/workspace/DiffusionAD/MVTec-AD",
  "visa_root_path":"/workspace/DiffusionAD/VisA",
  "dagm_root_path":"/workspace/DiffusionAD/dagm",
  "mpdd_root_path":"/workspace/DiffusionAD/mpdd",
  "anomaly_source_path":"/workspace/DiffusionAD/dtd",
  "noisier_t_range":600,
  "less_t_range":300,
  "condition_w":1,
  "eval_normal_t":200,
  "eval_noisier_t":400,
  "output_path":"/workspace/DiffusionAD/outputs"

}

After 1500 epochs, the train loss (trans_loss) is 3.232.

When I run eval.py, it reports missing keys, so I changed eval.py (line 381):

        unet_model.load_state_dict(output["unet_model_state_dict"],strict=False)
        unet_model.to(device)

After this change, I found that out_mask has better results, but recon_con is still a Gaussian image.

Isn't this training one model per category? Why can the Segmentation Sub-network be trained successfully, while Norm-guided One-step Denoising does not give good results? Is it because I didn't train for enough epochs? Or must all categories be trained to avoid the missing keys problem?

Looking forward to your reply

boxbox2 commented 5 months ago

One more thing: because my single card has only 16 GB, I switched to multiple cards to run unet_model, but seg_model did not run on multiple cards. Maybe it doesn't support multi-card running? I don't know. Has anyone encountered and solved this problem?

boxbox2 commented 5 months ago

I solved it. Previously, I selected multi-card training, but eval.py has no multi-card handling, so the missing key problem occurred. I first chose to set strict=False at line 385, but something went wrong. I solved the problem by adding code at line 381 of eval.py.
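For anyone hitting the same thing, here is a minimal sketch of one common way to handle it. It assumes the checkpoint was saved from a model wrapped in torch.nn.DataParallel, which stores every parameter key with a "module." prefix. An unwrapped model then sees none of its own keys. With strict=False the load passes without an error but leaves the weights at their random initialization, which would be consistent with recon_con looking like noise. The helper name strip_module_prefix and the sample checkpoint are hypothetical, not taken from the repository, and this is not necessarily the fix that was applied in eval.py.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading DataParallel prefix removed.

    Hypothetical helper: keys that do not start with the prefix are kept as-is.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for a checkpoint saved from a DataParallel-wrapped model
# (real values would be tensors; plain lists are used for illustration).
checkpoint = OrderedDict([
    ("module.conv.weight", [0.1]),
    ("module.conv.bias", [0.0]),
])

cleaned = strip_module_prefix(checkpoint)
print(list(cleaned.keys()))
```

In eval.py this could be applied as unet_model.load_state_dict(strip_module_prefix(output["unet_model_state_dict"])), keeping the default strict=True so that any remaining key mismatch raises an error instead of being skipped silently.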