yisol / IDM-VTON

IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild
https://idm-vton.github.io/

Training questions #44

Open nom opened 2 months ago

nom commented 2 months ago

Hey, great work! Quick question on training.

I was wondering how you're fitting two SDXL UNets (the garment UNet and the try-on UNet) on a single A800 with a per-GPU batch size of 24/4 = 6 (assuming 4xA800 in total). I see you're using FP16 models, but are you doing anything else to bring memory down, like precomputing embeddings/features, 8-bit Adam, or gradient accumulation? I'm trying to reproduce training, but I can only fit 3 samples at 1024x768 resolution in 80GB of VRAM, and a single step takes ~1.3 seconds on an H100. I'm already using tricks like these (8-bit Adam, precomputed VAE embeddings, a frozen garment UNet).

Also curious about training speed if you can share. Thanks!
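
For readers following the batch arithmetic above, here is a minimal sketch of how gradient accumulation could close the gap between 3 samples per GPU and a global batch of 24. The 4-GPU count is the question's own assumption, and `accumulation_steps` is a hypothetical helper, not part of the IDM-VTON code.

```python
import math


def accumulation_steps(global_batch: int, num_gpus: int, per_gpu_batch: int) -> int:
    """Micro-batches each GPU accumulates per optimizer step to reach global_batch."""
    samples_per_micro_step = num_gpus * per_gpu_batch
    return math.ceil(global_batch / samples_per_micro_step)


# Numbers from the question: global batch 24, 4 GPUs (assumed), 3 samples fit per GPU.
steps = accumulation_steps(global_batch=24, num_gpus=4, per_gpu_batch=3)
effective_batch = 4 * 3 * steps
print(f"accumulate {steps} micro-batches, effective batch = {effective_batch}")
```

With 6 samples per GPU the same helper returns 1, i.e. no accumulation is needed.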

yisol commented 2 months ago

Hello, we used gradient checkpointing and 8-bit Adam for training, and fit a batch size of 6 on a single A100 GPU. We didn't precompute latents or embeddings or use gradient accumulation, but you can use them to reduce memory cost. Training took around 1-2 days on 4xA100 GPUs for 63k iterations.
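
As a rough sanity check on the figures quoted above, the sketch below converts "1-2 days for 63k iterations" into seconds per step and samples per second. It assumes the global batch of 24 from the question (6 per GPU on 4 GPUs) and treats the 1-day and 2-day figures as bounds; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_step(days: float, iterations: int) -> float:
    """Average wall-clock time per training iteration."""
    return days * SECONDS_PER_DAY / iterations


ITERATIONS = 63_000
GLOBAL_BATCH = 24  # assumption: 6 per GPU x 4 GPUs, as discussed in the thread

for days in (1, 2):
    step_time = seconds_per_step(days, ITERATIONS)
    print(f"{days} day(s): {step_time:.2f} s/step, "
          f"{GLOBAL_BATCH / step_time:.1f} samples/s")
```

Under these assumptions a step takes somewhere between about 1.4 and 2.7 seconds at batch size 6 per GPU.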

nom commented 2 months ago

Thanks @yisol. Are you perhaps not doing EMA?

Also if you could share a work-in-progress rough train script here, that'd be really helpful - just to get a better understanding of the differences with mine, doesn't have to be a working script.
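
For context on why EMA matters for memory: an EMA keeps a second "shadow" copy of every trainable weight and blends it toward the live weights after each optimizer step. The sketch below shows the update rule on plain floats; it is an illustration only, the names are hypothetical, and nothing here says whether the authors used EMA.

```python
def ema_update(ema_params: dict, params: dict, decay: float = 0.999) -> None:
    """In-place update: ema = decay * ema + (1 - decay) * param, per parameter."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value


# Toy "model" with two scalar parameters.
params = {"w": 1.0, "b": 0.0}
ema = dict(params)  # the shadow copy: this is the extra memory EMA costs

params["w"] = 2.0  # pretend an optimizer step changed the live weight
ema_update(ema, params, decay=0.9)
print(ema)  # "w" has moved one tenth of the way from 1.0 toward 2.0
```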

ifeherva commented 2 months ago

Did you use noise_offset or snr_gamma (=5) during training?
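
For readers unfamiliar with the term, here is a minimal sketch of what `snr_gamma = 5` would do, assuming it refers to Min-SNR loss weighting for an epsilon-prediction model as in the diffusers training examples. The scaled-linear schedule values (`beta_start=0.00085`, `beta_end=0.012`, 1000 steps) are the usual Stable Diffusion defaults and are an assumption here, not something confirmed in this thread. `noise_offset` is not covered.

```python
import math


def alphas_cumprod(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for a scaled-linear beta schedule (assumed)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        prod *= 1.0 - beta
        out.append(prod)
    return out


def min_snr_weight(alpha_bar: float, gamma: float = 5.0) -> float:
    """Min-SNR loss weight for epsilon prediction: min(SNR, gamma) / SNR."""
    snr = alpha_bar / (1.0 - alpha_bar)
    return min(snr, gamma) / snr


schedule = alphas_cumprod()
for t in (0, 250, 500, 999):
    print(t, round(min_snr_weight(schedule[t]), 4))
```

The weight is small at low-noise timesteps (high SNR) and equals 1 once the SNR drops below gamma, so the loss is not dominated by the nearly clean timesteps.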

cardosofelipe commented 1 month ago

@yisol It would be really helpful indeed.

Anustup900 commented 1 month ago

Hey @nom, I am trying to replicate the training. It would be great if you could share a glimpse of your script; even a rough idea would work.

awzhgw commented 1 month ago

@nom can you share the fine-tune code with me?

jasonaidm commented 1 month ago

@nom can you share the training or fine-tune code with me?

thuc248997 commented 1 month ago

@nom can you share the fine-tune code?

awzhgw commented 1 month ago

@nom can you share the fine-tune code?

ttjygbtj22 commented 1 month ago

@nom can you share the fine-tune code?

nftblackmagic commented 3 weeks ago

I put together unofficial training code here. I'm still testing it. Please try it if you like: https://github.com/nftblackmagic/IDM-VTON-training