miccunifi / ladi-vton

[ACM MM 2023] - LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

Question about ablation study #46

Open colorful-liyu opened 10 months ago

colorful-liyu commented 10 months ago

Hi, thanks for sharing your great work!

I'm very interested in exploring the application of LDM to virtual try-on, and your work has inspired me. However, I'm confused by the second and third rows of Tab. 4 in your paper.

I notice that performance does not drop noticeably with empty strings (row 1) or textual elements (row 2). How can I obtain the textual elements? Is it by passing the garment images directly through VE to the U-Net?

Moreover, why does the performance drop so dramatically with f_theta, to the point of being much worse than with empty strings?

Looking forward to your reply! Thank you again!