miccunifi / ladi-vton

[ACM MM 2023] - LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

Issue with training VTO & Inversion Adapter #39

Open bertinma opened 11 months ago

bertinma commented 11 months ago

Hi,

I'm trying to train all the models with 1024x768 images. I managed to train TPS & EMASC at this resolution with some code modifications. Training works well according to the metrics and visual results.

But it doesn't work at all for the Inversion Adapter and VTO. Neither training run shows any loss reduction (the loss is close to constant with heavy smoothing on wandb and oscillates strongly without smoothing).

[Image: Screenshot 2023-09-18 at 18 15 22]
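To illustrate why a heavily smoothed curve can look nearly flat while the raw loss oscillates, here is a minimal sketch of a debiased exponential moving average. It is only similar in spirit to dashboard smoothing and is not claimed to be wandb's exact algorithm; the function name `ema_smooth`, the `weight` value, and the loss values are made up for illustration.

```python
def ema_smooth(values, weight=0.9):
    """Debiased exponential moving average of a sequence of loss values.

    `weight` in [0, 1) controls the smoothing strength (hypothetical default).
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    running = 0.0
    for step, value in enumerate(values, start=1):
        running = weight * running + (1.0 - weight) * value
        # Divide by (1 - weight**step) to remove the bias toward the zero start.
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed


# Made-up oscillating loss values, not taken from the actual training run.
raw = [0.10, 0.30, 0.08, 0.28, 0.12, 0.32, 0.09, 0.29]
smooth = ema_smooth(raw, weight=0.9)

print("raw range:     ", max(raw) - min(raw))
print("smoothed range:", max(smooth) - min(smooth))
```

With strong smoothing, the spread of the curve shrinks a lot, so a small downward trend can be hard to see either way.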

I also tested at 512x384 and got the same results. Is this expected?

I'm using the default parameters except batch_size = 8 for VTO and batch_size = 1 for the Inversion Adapter on a single A100 GPU. I assume that a batch size greater than 1 could prevent this training issue, but my hardware doesn't allow a bigger one 😞 I tried reducing the learning rate, but it leads to the same issue.

Commands used to train the Inversion Adapter and VTO:

Could you please help me resolve this problem? Thanks for your clean work, btw :)

ABaldrati commented 11 months ago

Hi @bertinma,

Thank you for your interest in our work!

Regarding experiments at a resolution of 1024x768, I must admit that we did not test the training procedure at that specific resolution, so I may not be able to provide you with precise guidance in that regard.

At a resolution of 512x384, although the loss behavior appeared to be similar, we did notice improvements in the metrics as the training progressed. Did you notice the same improvements in the metrics during training?

Regarding the batch size, did you try setting the --gradient_accumulation_step parameter so that batch_size multiplied by the number of gradient accumulation steps equals the desired batch size?
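As a rough illustration of the batch-size arithmetic above, the following pure-Python sketch computes how many accumulation steps reach a target effective batch size. It also checks, on a toy one-parameter least-squares model, that averaging the gradients of equally sized micro-batches matches the full-batch gradient. The function names, the toy data, and the target batch size of 8 are assumptions made for illustration; this is not the repository's training code.

```python
import math


def accumulation_steps(target_batch_size: int, per_step_batch_size: int) -> int:
    """Smallest number of accumulation steps whose effective batch reaches the target."""
    if target_batch_size < 1 or per_step_batch_size < 1:
        raise ValueError("batch sizes must be positive")
    return math.ceil(target_batch_size / per_step_batch_size)


def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average the gradients of equally sized micro-batches (one optimizer step)."""
    chunks = [
        (xs[i:i + micro_batch_size], ys[i:i + micro_batch_size])
        for i in range(0, len(xs), micro_batch_size)
    ]
    return sum(grad_mse(w, cx, cy) for cx, cy in chunks) / len(chunks)


# Toy data, made up for this example.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 15.9]

# Hypothetical target: an effective batch of 8 with a per-step batch size of 1.
steps = accumulation_steps(target_batch_size=8, per_step_batch_size=1)
full = grad_mse(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, micro_batch_size=1)

print("accumulation steps:", steps)
print("gradients match:", math.isclose(full, accum))
```

The equivalence holds here because the toy loss is a plain mean over samples and the micro-batches are equally sized. Layers that depend on batch statistics would not behave identically.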

Another point: if you want to achieve optimal performance, you need to use the --train_inversion_adapter flag during VTO training.

If you have any more questions or need further insights, please feel free to ask.

Best regards, Alberto