Acly / krita-ai-diffusion

Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.
https://www.interstice.cloud
GNU General Public License v3.0

[Enhancement | Feature request] PowerPaint pipeline integration #221

Closed Pirog17000 closed 7 months ago

Pirog17000 commented 9 months ago

https://github.com/open-mmlab/mmagic/tree/main/projects/powerpaint One link is worth a thousand words. Better inpainting, better outpainting, more stability and consistency of the image.

https://powerpaint.github.io/

Acly commented 9 months ago

Uh huh, I saw it and quickly ran my previous test set through their online demo.

| Test | Input | PowerPaint | CN+IP (current) |
|---|---|---|---|
| Tori | base+mask | download | ComfyUI_00023_ |
| Illustration | base+mask | download | ComfyUI_00041_ |
| Bruges | base+mask | download | ComfyUI_00016_ |

Let's just say I'm not convinced so far. Maybe I'm not doing it optimally.

Pirog17000 commented 9 months ago

Well, there's no inpaint ControlNet for SDXL models at the moment. A possible solution: https://github.com/showlab/X-Adapter. It would be great to have it as an extra option during setup of the plugin modules, or as one of the options for outpaint/inpaint.

zengyh1900 commented 9 months ago

There are multiple options for inpainting in PowerPaint, such as text-guided object inpainting (requires text input and inserts objects into images) and text-free object removal (no need for text input). That means if we want to remove something, we should use the object removal option (which, for sure, is not always perfect).

I tested on the online demo and got some results without cherry-picking.

With this input: [input image]

I get: [PowerPaint result]

Then, extending it in the horizontal direction: [horizontal outpaint result]

Or in the vertical direction: [vertical outpaint result]

We are still tuning a stronger and more stable version.

zengyh1900 commented 9 months ago

Is the inpainting model for SDXL important? We did plan to develop an inpainting model for SDXL, but we have put it on hold for a while due to some other reasons.

Pirog17000 commented 9 months ago

I personally prefer SDXL over 1.5 models: better understanding of prompts, higher resolution, and less repetition when the canvas goes beyond the focus window (for 1.5 that was about 768x768px; SDXL sometimes supports up to 1680x1680). So if possible, yes, that would be fantastic @zengyh1900

Acly commented 9 months ago

Thanks for your input, it certainly looks better than my initial try. I tried multiple modes, with and without a prompt. The results I compare against were all done without any prompt or user input (only image + mask).

The paper mentions "Context-aware Inpainting" and "Object Removal" as separate modes, but I cannot find "Context-aware" in the online demo?

> Is the inpainting model for SDXL important?

ControlNet Inpaint has the advantage that you can use a specialized SD model which fits the image content for the task. General inpaint solutions, on the other hand, may be good for e.g. photos, but not as strong in other areas. So if your use case is to inpaint photos, I don't think SDXL has a big advantage: its improvements are mainly relevant for whole-image generation, and it seems less adaptable to other tasks than SD1.5.
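For illustration, here is a minimal diffusers sketch of the ControlNet Inpaint half of this setup (the IP-Adapter part is omitted). It assumes the lllyasviel/control_v11p_sd15_inpaint model; the checkpoint ID and file names are placeholders, and any style-matched SD1.5 checkpoint could be swapped in:

```python
# Sketch: ControlNet Inpaint paired with an arbitrary SD1.5 checkpoint.
# Checkpoint ID and file names are placeholders, not the plugin's actual code.
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline
from PIL import Image

def make_inpaint_condition(image: Image.Image, mask: Image.Image) -> torch.Tensor:
    # The inpaint ControlNet expects masked pixels to be marked with -1.
    img = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    m = np.array(mask.convert("L")).astype(np.float32) / 255.0
    img[m > 0.5] = -1.0
    return torch.from_numpy(img[None].transpose(0, 3, 1, 2))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # swap in the checkpoint that fits the image
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("base.png")
mask = Image.open("mask.png")
result = pipe(
    prompt="",  # works without a prompt, as in the comparisons above
    image=image,
    mask_image=mask,
    control_image=make_inpaint_condition(image, mask),
).images[0]
result.save("inpainted.png")
```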

However, another use case is generating artwork that follows a certain style, and then using inpainting steps to edit and refine the image. Here it really helps if the model used for inpainting matches the one used to generate the initial image. So if a certain SDXL checkpoint was used, you want to use it for inpaint/outpaint as well, in order to easily match the chosen style. This is an area where SD offers much more flexibility than e.g. Adobe's solution.

That being said, I'm open to investigating alternatives. One idea is to use a model trained for inpaint (such as LaMA or PowerPaint) to generate initial content in the masked area, and then use any non-inpaint SD(XL) model as a second low-denoise img2img pass.
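A minimal sketch of that two-pass idea with diffusers, using OpenCV's Telea inpaint as a crude stand-in for LaMA (model ID and file names are placeholders):

```python
# Sketch of the two-pass idea: classical fill, then low-denoise img2img.
# cv2.inpaint stands in for LaMA/PowerPaint; file names are placeholders.
import cv2
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from PIL import Image

image = cv2.imread("base.png")
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)

# Pass 1: fill the masked area with plausible surrounding content.
filled = cv2.inpaint(image, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
filled_pil = Image.fromarray(cv2.cvtColor(filled, cv2.COLOR_BGR2RGB))

# Pass 2: low-denoise img2img with any non-inpaint SDXL checkpoint,
# so the fill gets re-rendered in the checkpoint's style.
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
result = pipe(prompt="", image=filled_pil, strength=0.35).images[0]
result.save("refined.png")
```

The low strength value keeps the second pass close to the filled content while still harmonizing it with the checkpoint's look.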

Pirog17000 commented 9 months ago

> use a model trained for inpaint (such as LaMA or PowerPaint) to generate initial content in the masked area, and then use any non-inpaint SD(XL) model as a second low-denoise img2img pass.

This is my main approach. But at the moment I generally just paint key colors and tones for proper inpainting. Sometimes it's very tedious, but it works. If there's a ready-to-go solution, like a model specifically made for this purpose, it would be great to have it onboard.

Pirog17000 commented 9 months ago

@Acly one more approach has arrived:

https://github.com/damo-vilab/AnyDoor

zhuang2002 commented 9 months ago

PowerPaint Update

We've released an updated model with enhanced stability, now available for interactive testing on our online demo.

Paper: https://arxiv.org/abs/2312.03594
HomePage: https://powerpaint.github.io/
Online Demo: https://openxlab.org.cn/apps/detail/rangoliu/PowerPaint
Code: https://github.com/open-mmlab/mmagic/tree/main/projects/powerpaint

Pirog17000 commented 9 months ago

@Acly An inpaint ControlNet model for SDXL has arrived: https://civitai.com/models/136070?modelVersionId=271562. Can this be included, please?

Acly commented 9 months ago

It's... interesting. It actually does a pretty good job at putting content into the masked area, but it drastically changes the colors of the entire image, so the inpaint result no longer fits into the original image.

The examples on the model page show this, and the description says:

> Depending on the prompts, the rest of the image might be kept as is or modified more or less.

...but I have no idea how to "keep it as is". Anyway, it's an alpha development version; not sure how you found this, but it doesn't look like it's officially released.
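One generic workaround (independent of this particular ControlNet) is to composite the generated result back over the original using the mask, so pixels outside the masked area are guaranteed unchanged. A minimal PIL sketch, with placeholder file names:

```python
# Paste-back: keep the original pixels outside the mask, so global color
# shifts from the model cannot affect the rest of the image.
from PIL import Image, ImageFilter

original = Image.open("base.png").convert("RGB")
generated = Image.open("inpaint_result.png").convert("RGB")
mask = Image.open("mask.png").convert("L")  # white = inpainted region

# Feather the mask edge a little so the seam blends instead of cutting hard.
soft_mask = mask.filter(ImageFilter.GaussianBlur(radius=4))

composite = Image.composite(generated, original, soft_mask)
composite.save("composited.png")
```

This protects the untouched area, but it cannot fix a fill whose colors don't match the surrounding image in the first place.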

Acly commented 7 months ago

Inpaint for SDXL is now implemented in v1.14.0, along with many other improvements. It's a solution that works with existing and custom SD checkpoints, which ends up being more flexible and introduces fewer additional models to this project.