open-mmlab / PowerPaint

[ECCV 2024] PowerPaint, a high-quality, versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting and shape-guided object inpainting with only a single model.
https://powerpaint.github.io/
MIT License
364 stars 18 forks

Some questions about Prompt #44

Open WenmuZhou opened 1 month ago

WenmuZhou commented 1 month ago

Hello, I have a few questions about the prompts in the code and the paper. I hope you can help me answer them.

  1. text-guided

According to the paper, the prompt for the text-guided task should only be P_obj: ![image](https://github.com/open-mmlab/PowerPaint/assets/12406017/5c399d7f-94f2-4e24-a0fc-2cb18805c3df)

But in the code, P_obj is also added to the negative_prompt. Why is this?

https://github.com/open-mmlab/PowerPaint/blob/main/gradio_PowerPaint_BrushNet.py#L101-L104

  2. shape-guided

There is a fitting_degree parameter in the code that adjusts the weight between shape and text, but promptA, promptB, negative_promptA and negative_promptB are set to the same value, and the weighted sum is computed as follows:

prompt_embeds = prompt_embedsA * (t) + (1 - t) * prompt_embedsB  # https://github.com/open-mmlab/PowerPaint/blob/main/pipeline/pipeline_PowerPaint_Brushnet_CA.py#L345

negative_prompt_embeds = negative_prompt_embedsA[0] * (t_nag) + (1 - t_nag) * negative_prompt_embedsB[0]  # https://github.com/open-mmlab/PowerPaint/blob/main/pipeline/pipeline_PowerPaint_Brushnet_CA.py#L424

The values passed in for t and t_nag are the same. In my understanding, as t increases, the weight of the shape in the prompt becomes larger, so the weight of the shape in the negative_prompt should become smaller; that is, t_nag = 1 - t would be correct. Is this a mistake in the code, or am I misunderstanding something?
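To make the question concrete, here is a minimal sketch of the two weighting schemes. It only reproduces the arithmetic of the two lines quoted above, on toy two-element lists. The `blend` helper, the toy values, and the reading of the A side as "shape" and the B side as "text" are assumptions for illustration, not code from the PowerPaint pipeline.

```python
def blend(embeds_a, embeds_b, weight):
    """Element-wise weight * a + (1 - weight) * b, as in the quoted lines."""
    return [weight * a + (1 - weight) * b for a, b in zip(embeds_a, embeds_b)]


# Toy stand-ins for the real embedding tensors (hypothetical values).
shape_embeds = [1.0, 0.0]  # plays the role of the "A" embeddings
text_embeds = [0.0, 1.0]   # plays the role of the "B" embeddings

t = 0.75  # stands in for fitting_degree

# Positive prompt: the shape side gets weight t.
prompt_embeds = blend(shape_embeds, text_embeds, t)

# Behaviour described in this issue: the negative side reuses the same weight.
negative_same = blend(shape_embeds, text_embeds, t)

# Alternative asked about in this issue: the negative side uses 1 - t.
negative_complement = blend(shape_embeds, text_embeds, 1 - t)

print(prompt_embeds, negative_same, negative_complement)
```

With t_nag = t, the positive and negative mixes put the same weight on the shape side; with t_nag = 1 - t, the negative mix shifts toward the text side as t grows.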

Hope to get your answer.