Questions about base model and prompt for PowerPaintV2

dydxdt commented 2 months ago

Great work! I have some questions about PowerPaintV2. 1) I see that the base model in gradio_PowerPaint_BrushNet.py is RealisticVision (RV for short), and the BrushNet repo also provides the RV model. So does PowerPaintV2 need to be retrained, or is merging BrushNet with the original PowerPaint enough for inference? 2) The prompts used in the PP v1 and PP v2 inference scripts are different. Why? For example, V1 has something like ', worst quality, low quality, normal quality, bad quality, blurry P_shape', while V2 doesn't. And is a prompt like 'empty scene' a custom design? 3) Can you give some advice on the training scripts for reference? I may want to train it myself. Thanks very much!

sherlhw commented 2 months ago

+1

lijiaxing0213 commented 2 months ago

Yes, PowerPaintV2 combined with BrushNet needs training, because introducing BrushNet changes the original distribution of the latent features. Furthermore, the main idea of PowerPaint is to optimize the embeddings of the task prompts. You can refer to the implementation of the EmbeddingLayerWithFixes class at https://mmagic.readthedocs.io/en/latest/autoapi/mmagic/models/editors/disco_diffusion/clip_wrapper/index.html#mmagic.models.editors.disco_diffusion.clip_wrapper.EmbeddingLayerWithFixes. Pay attention to the add_tokens() and add_embeddings() methods.
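
To illustrate the idea of optimizing a task-prompt embedding, here is a minimal pure-Python sketch. It is not the mmagic EmbeddingLayerWithFixes implementation and does not use its API: the class ToyTaskEmbedding, its methods, the tiny vocabulary, and the hand-written gradients are all hypothetical and only show the bookkeeping, namely that the base vocabulary stays frozen while a newly added task token such as P_shape gets its own trainable vector.

```python
import random


class ToyTaskEmbedding:
    """Toy embedding table (hypothetical): frozen base vocabulary plus trainable task tokens."""

    def __init__(self, base_vocab, dim, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self.token_to_id = {tok: i for i, tok in enumerate(base_vocab)}
        self.weights = [self._random_vector() for _ in base_vocab]
        self.trainable_ids = set()  # only task tokens are ever added here

    def _random_vector(self):
        return [self._rng.gauss(0.0, 0.02) for _ in range(self.dim)]

    def add_task_token(self, token, init_from=None):
        """Append a new token; optionally copy its initial vector from an existing token."""
        if token in self.token_to_id:
            raise ValueError(f"token already exists: {token}")
        if init_from is None:
            vector = self._random_vector()
        else:
            vector = list(self.weights[self.token_to_id[init_from]])
        new_id = len(self.weights)
        self.token_to_id[token] = new_id
        self.weights.append(vector)
        self.trainable_ids.add(new_id)
        return new_id

    def lookup(self, tokens):
        """Return the embedding vector for each token in the sequence."""
        return [self.weights[self.token_to_id[t]] for t in tokens]

    def sgd_step(self, grads, lr=0.1):
        """Apply gradients (token id -> vector); rows of frozen tokens are skipped."""
        for token_id, grad in grads.items():
            if token_id not in self.trainable_ids:
                continue
            row = self.weights[token_id]
            for k in range(self.dim):
                row[k] -= lr * grad[k]


# Usage with made-up gradients: only the task token moves.
emb = ToyTaskEmbedding(["a", "photo", "of", "shape"], dim=4)
shape_id = emb.add_task_token("P_shape", init_from="shape")
frozen_before = list(emb.weights[0])
emb.sgd_step({0: [1.0] * 4, shape_id: [1.0] * 4})
vectors = emb.lookup(["a", "photo", "P_shape"])
```

In a real training setup the gradients would come from the diffusion loss, and the embedding layer would be the text encoder's, so this should be read only as a picture of which parameters are updated.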

sherlhw commented 2 months ago

Thanks for your reply!

open-mmlab / PowerPaint
