miccunifi / ladi-vton

[ACM MM 2023] - LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

Poor results #9

Closed ml1652 closed 1 year ago

ml1652 commented 1 year ago

Hi, I've been trying to replicate the results shown in Figure 7 of your paper, but my output does not match what the paper presents. Specifically, the pattern on the T-shirt is not reproduced. image Here is the same garment, which I found in zalando-hd: 00579_00. Here is the result when I run your code: 00654_00_00579_00. Could you help or offer any guidance on this issue? Thanks in advance.

ml1652 commented 1 year ago

More failure cases, tested with the same garment:

09569_00_00579_00 11085_00_00579_00 12345_00_00579_00 12419_00_00579_00 12562_00_00579_00

ABaldrati commented 1 year ago

Hi @ml1652

As stated in its caption, the figure you mention (Figure 7) is an ablation study on the contribution of the Stable Diffusion VAE and EMASC. The images in that figure are only compressed and decompressed with the Stable Diffusion VAE, without using the denoising network.

Alberto

trituenhantao commented 1 year ago

Hi @ABaldrati, how do I use the denoising network?

TalhaUusuf commented 11 months ago

Also, according to the paper, the mask is inverted (cloth area black, rest white), so the EMASC module learns to reconstruct the area other than the cloth (the white area), i.e. hands, arms, and face. It therefore seems mistaken to expect EMASC to affect the rendered cloth pattern. From the paper and the pipeline code, the warping module appears to have the greatest effect on cloth patterns.
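
To make the masking argument concrete, here is a toy sketch. It is not the repository's code: the function name `masked_skip`, the tiny 2x3 feature maps, and the scalar `weight` are all hypothetical, and `weight` merely stands in for the learned layers of the real module. It only shows that multiplying encoder features by an inverted mask (0 on the cloth, 1 elsewhere) before adding them to the decoder features leaves the cloth region untouched, under the assumption of a per-pixel, bias-free transform.

```python
def masked_skip(encoder_feat, decoder_feat, keep_mask, weight=0.5):
    """Toy masked skip connection on 2D feature maps (lists of lists).

    keep_mask is the inverted mask: 0 on the cloth area (black),
    1 everywhere else (white). `weight` is a hypothetical stand-in
    for the learned transform applied to the masked features.
    """
    out = []
    for enc_row, dec_row, mask_row in zip(encoder_feat, decoder_feat, keep_mask):
        out_row = []
        for e, d, m in zip(enc_row, dec_row, mask_row):
            forwarded = e * m                    # zero on the cloth area
            out_row.append(d + weight * forwarded)
        out.append(out_row)
    return out


encoder_feat = [[4.0, 4.0, 4.0],
                [4.0, 4.0, 4.0]]
decoder_feat = [[1.0, 1.0, 1.0],
                [1.0, 1.0, 1.0]]
keep_mask = [[1, 0, 0],   # 0 = cloth area, 1 = rest (arms, face, ...)
             [1, 0, 1]]

result = masked_skip(encoder_feat, decoder_feat, keep_mask)
print(result)  # [[3.0, 1.0, 1.0], [3.0, 1.0, 3.0]]
```

Wherever the mask is 0 the decoder value passes through unchanged, so in this simplified picture the skip connection cannot add detail to the cloth pattern. A real module with spatial kernels or biases would not be exactly zero there, so treat this as an illustration only.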

image