Yujun-Shi / DragDiffusion

[CVPR2024, Highlight] Official code for DragDiffusion
https://yujun-shi.github.io/projects/dragdiffusion.html
Apache License 2.0

Can not reproduce the results #38

Closed by xizaoqu 11 months ago

xizaoqu commented 11 months ago

Hi, I tried the same picture used in the demo, but the result looks weird. Does anything need to be configured further? [image]

Yujun-Shi commented 11 months ago

Hello, this problem probably occurs because the mask you painted was not properly updated, due to some bugs in Gradio.

For example, in the normal case, the second column of the image (the one under "click points") should look like this: [image]

Every time you start dragging, make sure the second column has the above appearance, where the masked regions are highlighted.

To solve this problem, simply click again in the "Draw Mask" area, and then the mask will be properly updated.
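The visual check above can also be done programmatically when running outside the UI. The following is only a minimal sketch, assuming the mask is available as a 2-D single-channel array (nested lists or a NumPy array) in which painted pixels are non-zero. The helper name `mask_is_painted` is hypothetical and is not part of the DragDiffusion code.

```python
def mask_is_painted(mask) -> bool:
    """Return True if any pixel in a 2-D, single-channel mask is non-zero.

    Hypothetical helper for illustration only. `mask` may be a nested list
    or a 2-D NumPy array; multi-channel masks are not handled.
    """
    return any(any(row) for row in mask)


# A 4x4 mask with nothing painted, as happens when the UI fails to update it.
empty_mask = [[0, 0, 0, 0] for _ in range(4)]

# The same mask after painting a 2x2 region.
painted_mask = [row[:] for row in empty_mask]
for y in (1, 2):
    for x in (1, 2):
        painted_mask[y][x] = 255

print(mask_is_painted(empty_mask))
print(mask_is_painted(painted_mask))
```

If such a check returns `False` right before dragging, the mask was most likely not updated and should be redrawn.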

leoShen917 commented 11 months ago

Hi @Yujun-Shi, I tried to edit this image as well, but you can see that the face is distorted after dragging. Is this because the LoRA used by the authors was not trained on this single image, but on many anime images? Isn't it difficult for a LoRA trained on a single image to handle this kind of occlusion problem? Looking forward to your reply, thanks! [image]

Yujun-Shi commented 11 months ago

Oh, I see where the problem is, @xizaoqu @leoShen917. Actually, for this example, we're dragging a diffusion-generated image. We used Counterfeit-V2.5 to produce the results. Details are given as follows: [image]

Here, the seed is 65536.

The positive prompt is: ((masterpiece,best quality)),1girl, bangs, blue eyes, blurry background, branch, brown hair, dappled sunlight, flower, from side, hair flower, hair ornament, japanese clothes, kimono, leaf, (maple leaf:1.9), obi, outdoors, sash, solo, sunlight, upper body

The negative prompt is: (low quality, worst quality:1.4), (bad anatomy), (inaccurate limb:1.2), bad composition, inaccurate eyes, extra digit, fewer digits, (extra arms:1.2), large breasts

In addition, as described in Appendix D in our updated report (https://arxiv.org/pdf/2306.14435.pdf), when dragging diffusion-generated images, we do not have to train any LoRA. Simply generate the image and then drag it.
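For anyone who prefers a script, the settings above could be collected roughly as follows. This is only a sketch under several assumptions: it uses the `diffusers` library, and it assumes the model is available under the Hugging Face repo id `gsdf/Counterfeit-V2.5`. The sampler, step count and guidance scale are not stated in the text above, so the library defaults are used and the output should not be expected to match the demo image exactly. Note also that the `(maple leaf:1.9)` weighting syntax comes from Stable Diffusion web UIs; the plain `diffusers` pipeline passes it to the text encoder as literal text rather than applying the weight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    model_id: str
    seed: int
    prompt: str
    negative_prompt: str


SETTINGS = GenerationSettings(
    model_id="gsdf/Counterfeit-V2.5",  # assumed Hugging Face repo id
    seed=65536,
    prompt=(
        "((masterpiece,best quality)),1girl, bangs, blue eyes, "
        "blurry background, branch, brown hair, dappled sunlight, flower, "
        "from side, hair flower, hair ornament, japanese clothes, kimono, "
        "leaf, (maple leaf:1.9), obi, outdoors, sash, solo, sunlight, "
        "upper body"
    ),
    negative_prompt=(
        "(low quality, worst quality:1.4), (bad anatomy), "
        "(inaccurate limb:1.2), bad composition, inaccurate eyes, "
        "extra digit, fewer digits, (extra arms:1.2), large breasts"
    ),
)


def generate(settings: GenerationSettings):
    """Generate one image with library-default sampler settings.

    Heavy imports live inside the function so the settings can be
    inspected without torch or diffusers installed.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(settings.model_id)
    generator = torch.Generator("cpu").manual_seed(settings.seed)
    result = pipe(
        prompt=settings.prompt,
        negative_prompt=settings.negative_prompt,
        generator=generator,
    )
    return result.images[0]
```

Calling `generate(SETTINGS)` downloads the model weights on first use and returns a PIL image that can then be dragged as a diffusion-generated image, without training a LoRA.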

Yujun-Shi commented 11 months ago

In your case, I think you're treating the image as a real image and using Stable Diffusion 1.5 to process it. Since SD1.5 is trained on a general dataset, its ability to handle anime images might not be very good.

xizaoqu commented 11 months ago

Understood, thanks for your reply.