yisol / IDM-VTON

IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild
https://idm-vton.github.io/

Questions about diffusers version, garment unet details #72

Open PkuDavidGuan opened 1 month ago

PkuDavidGuan commented 1 month ago

Nice work. I read your code in detail and tried to reproduce the training code, but I have a few questions:

  1. What diffusers version do you use? I tried 0.24.0 and 0.27.2, and neither runs directly.
  2. I see you comment out added_cond_kwargs in tryon_pipeline.py L1783. Why don't you pass added_cond_kwargs to the garment_unet?
  3. Is it necessary to run garment_net multiple times to get reference_features in L1784? I think it would be fine to reuse the reference_features from timestep=0 across all denoising steps, as in Moore-AnimateAnyone, which would save a lot of time.
  4. Why is garment_unet kept frozen during training? Other papers such as Magic-Animate and Animate-Anyone have a similar reference net, and in both cases the reference net is trained.
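The caching idea in question 3 can be sketched as follows. This is a minimal illustration, not the actual IDM-VTON code: `garment_net`, the denoising loops, and the returned features are hypothetical stand-ins, used only to show the difference in garment_net forward-pass counts between the two strategies.

```python
# Count garment_net forward passes under the two strategies discussed:
# recomputing reference features every denoising step vs. computing them
# once at timestep=0 and reusing them (as in Moore-AnimateAnyone).
call_count = {"garment_net": 0}

def garment_net(garment_latent, timestep):
    """Stand-in for the garment UNet; returns fake reference features."""
    call_count["garment_net"] += 1
    return [f"feat@t={timestep}"]

def denoise_recompute(garment_latent, timesteps):
    """Recompute reference_features at every denoising step."""
    for t in timesteps:
        reference_features = garment_net(garment_latent, t)
        # ... main try-on UNet step consuming reference_features ...
    return reference_features

def denoise_cached(garment_latent, timesteps):
    """Compute reference_features once (timestep=0) and reuse them."""
    reference_features = garment_net(garment_latent, 0)
    for t in timesteps:
        # ... main try-on UNet step reuses the cached reference_features ...
        pass
    return reference_features

timesteps = list(range(30))  # e.g. 30 denoising steps
denoise_recompute(None, timesteps)
assert call_count["garment_net"] == 30  # one forward pass per step

call_count["garment_net"] = 0
denoise_cached(None, timesteps)
assert call_count["garment_net"] == 1   # a single forward pass total
```

Whether the cached variant preserves quality depends on how strongly the garment features vary with the timestep embedding, which is exactly what the question is probing.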
PkuDavidGuan commented 1 month ago

Another question is about customization. In your appendix A.2, you mention the model is fine-tuned for 100 steps for customization. I am curious whether 100 steps is enough for a general model to get good results on all clothes in an in-the-wild dataset. Do you mean you train one customization model per garment?
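The per-garment reading of the question could be sketched like this. Everything here is hypothetical stand-in code (`customize`, the weight dicts, the garment IDs), not the paper's training procedure; only the 100-step budget comes from appendix A.2 as quoted above.

```python
# Sketch of "one customization model per garment": each garment gets its
# own short fine-tune (100 steps) starting from the shared base weights.
CUSTOMIZATION_STEPS = 100  # budget mentioned in appendix A.2

def customize(base_weights, garment_id):
    """Fine-tune a fresh copy of the base checkpoint on one garment."""
    weights = dict(base_weights)  # start from the shared checkpoint
    for _ in range(CUSTOMIZATION_STEPS):
        # ... one gradient step on this garment's (garment, person) pairs ...
        weights["steps_seen"] = weights.get("steps_seen", 0) + 1
    return weights

base_weights = {"name": "idm-vton-base"}
# One customized checkpoint per garment, rather than one model for all.
checkpoints = {g: customize(base_weights, g)
               for g in ["garment_a", "garment_b"]}
assert all(w["steps_seen"] == CUSTOMIZATION_STEPS
           for w in checkpoints.values())
assert "steps_seen" not in base_weights  # base checkpoint is untouched
```

Under this reading the cost scales linearly with the number of garments, which is why the question of whether 100 steps generalizes matters in practice.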

thuc248997 commented 1 month ago

@PkuDavidGuan Can you share the training code with me?