gligen / GLIGEN

Open-Set Grounded Text-to-Image Generation
MIT License

How to combine 2d box and canny edge to control the image generation together? #63

Open 1028686314 opened 8 months ago

1028686314 commented 8 months ago

Thank you for the follow-up updates that added more control modalities, including edge and depth, and for releasing them so quickly. I have a question: if I want to use 2D boxes and a Canny edge map together to control image generation, how should the UNet structure be redesigned? For example, would it work to simply stack two gated self-attention layers, one fusing the 2D box embeddings and the other fusing the edge embeddings? Do you have any other recommendations from experience? I look forward to your answer. Thank you very much!
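
To make the stacking idea concrete, here is a minimal PyTorch sketch of what I have in mind. It is only an illustration of the question, not code from this repository: the class names `GatedSelfAttention` and `StackedGrounding`, all dimensions, and the use of `nn.MultiheadAttention` as a stand-in attention layer are my own assumptions, and the gated feed-forward sub-layer is left out for brevity. Each condition gets its own layer with its own zero-initialised `tanh` gate, so the block starts out as an identity on the visual tokens.

```python
import torch
import torch.nn as nn


class GatedSelfAttention(nn.Module):
    """Simplified gated self-attention over [visual tokens; condition tokens].

    Hypothetical stand-in for illustration; not the repository's implementation.
    """

    def __init__(self, dim: int, cond_dim: int, heads: int = 4):
        super().__init__()
        self.proj = nn.Linear(cond_dim, dim)  # map condition tokens to the visual width
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Zero-initialised gate: tanh(0) = 0, so the layer starts as an identity.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        n_visual = x.shape[1]
        tokens = self.norm(torch.cat([x, self.proj(cond)], dim=1))
        out, _ = self.attn(tokens, tokens, tokens, need_weights=False)
        # Keep only the outputs at the visual-token positions, added through the gate.
        return x + torch.tanh(self.alpha) * out[:, :n_visual]


class StackedGrounding(nn.Module):
    """Applies a box-fusing layer, then an edge-fusing layer (assumed ordering)."""

    def __init__(self, dim: int, box_dim: int, edge_dim: int, heads: int = 4):
        super().__init__()
        self.box_fuser = GatedSelfAttention(dim, box_dim, heads)
        self.edge_fuser = GatedSelfAttention(dim, edge_dim, heads)

    def forward(self, x, box_tokens, edge_tokens):
        x = self.box_fuser(x, box_tokens)
        x = self.edge_fuser(x, edge_tokens)
        return x


# Toy shapes, chosen arbitrarily for the example.
torch.manual_seed(0)
block = StackedGrounding(dim=64, box_dim=32, edge_dim=16)
x = torch.randn(2, 256, 64)      # visual tokens (batch, tokens, channels)
boxes = torch.randn(2, 30, 32)   # box grounding tokens
edges = torch.randn(2, 256, 16)  # edge tokens, e.g. from a downsampled edge map
out = block(x, boxes, edges)
```

With both gates at zero the output equals the input, so in this sketch each condition's influence would be learned separately through its own gate. Whether this sequential arrangement is better than, say, concatenating the box and edge tokens and feeding them to a single gated layer is exactly what I am unsure about.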