ali-vilab / Cones-V2

MIT License
502 stars 18 forks

Why does the model perform well when I try to control the location of objects with the residual token embedding? #7

Open ouroboros-phy opened 10 months ago

ouroboros-phy commented 10 months ago

Thank you for this amazing work! However, when I use the layout guidance method without the residual token embedding, the generated result does not place the objects at the locations specified by the guidance. If I enable the residual token embedding, it works well! Have you seen this before? I would like to know why it works well in this case.

Johanan528 commented 10 months ago

Hi, thank you for your interest in our work! In fact, we have conducted relevant experiments, and our method is equally effective when not customizing subjects but only controlling the position of the subject, as shown in the figure below. Please provide a detailed description of your question so that we can better assist you.

image

dreamer121121 commented 1 month ago

I have run into the same problem! If you use SD 2.1 to generate a picture without the residual token embedding, it is difficult to control the location of the object. Examples can be seen in the screenshots below.

Screenshot 2024-08-20 15 00 22

Screenshot 2024-08-20 15 06 49

Screenshot 2024-08-20 15 00 41

Screenshot 2024-08-20 15 07 22