Bing-su / adetailer

Auto detecting, masking and inpainting with detection model.
GNU Affero General Public License v3.0
4.11k stars · 318 forks

[Feature Request]: #367

Closed — Ubermeisters closed this 10 months ago

Ubermeisters commented 11 months ago

Is your feature request related to a problem? Please describe.

When running img2img inpainting in "whole picture" mode rather than "only masked" mode, ADetailer, when enabled, spends a lot of time processing detected features outside the inpainted area. Those cycles are always completely wasted, since the area outside the inpainting mask is discarded from the final output.

Proposed workflow:

Describe the solution you'd like

1. Go to the img2img: Inpainting tab.

2. Set inpaint mode to "whole picture" rather than "only masked".

3. Enable ADetailer processing in default mode.

4. OPTION A: Add a toggle for ADetailer: "whole picture" vs. "only masked".

5. Run image generation.

Describe alternatives you've considered

OPTION B: Add a toggle for ADetailer: "follow inpainting mode" or "opposite of inpainting mode" (i.e., if inpainting is set to "whole picture", ADetailer either uses that same mode or the opposing one).

OPTION C: If real-time image generation preview is enabled, add a button to force ADetailer to skip whichever detected feature it is currently processing (this would be handy as a standing option anyway).
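To make Option B concrete, here's a minimal sketch of how a "follow"/"opposite" toggle could resolve which region ADetailer processes. This is purely hypothetical: ADetailer has no such setting today, and all the names below are illustrative.

```python
# Hypothetical sketch of the proposed Option B toggle. ADetailer has no
# such setting; these mode names are illustrative only.

def resolve_adetailer_region(inpaint_mode: str, adetailer_mode: str) -> str:
    """Decide which region ADetailer should process.

    inpaint_mode:   the img2img setting, "whole picture" or "only masked"
    adetailer_mode: the proposed toggle, "follow" or "opposite"
    """
    if adetailer_mode == "follow":
        return inpaint_mode
    # "opposite": flip whatever the img2img inpainting mode is
    return "only masked" if inpaint_mode == "whole picture" else "whole picture"
```

Under this sketch, "follow" with inpainting set to "whole picture" would keep ADetailer on the whole picture, while "opposite" would confine it to the masked area.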

Additional context

No response

Bing-su commented 11 months ago

adetailer is just a tool that does inpainting, so if you set up inpainting in img2img and also turn on adetailer, it's just duplication. Turn off adetailer when you do inpainting.

Ubermeisters commented 11 months ago

> adetailer is just a tool that does inpainting, so if you set up inpainting in img2img and also turn on adetailer, it's just duplication. Turn off adetailer when you do inpainting.

Right, but it processes more area than needed; there is no way to set a mask for the detection. Could detection be limited based on the inpainting region?

Ubermeisters commented 11 months ago

Allow me to explain better. I'm sure I just don't have an intimate enough understanding of how any of this works, but to be certain, I want to explain a real-life scenario I'm fighting with.

I'm using Automatic1111 on my Windows PC (GPU: 3060 Ti). I'm processing some family portraits for a friend as a holiday gift.

I'm using the ToonYou stable diffusion 1.5 checkpoint, and a Lora network to stylize in the theme of Studio Ghibli.

I haven't found a method to convert the portraits to nice Ghibli-styled images in a single img2img pass. I've found I need to use inpainting and composite the image in stages: background first, then the person(s). If I don't use ADetailer, the faces are pretty awful. Using ADetailer, as you are well aware, I can salvage the recognizable features at low denoising strength (~0.03) and not end up with a family of freaks with 3 eyes or whatever. I've not had luck using the hand-only ADetailer model for this specific purpose; it doesn't handle the toon shading well or something. The person models work much better for hands in my use case.

If I inpaint just a face with the person ADetailer model enabled, ADetailer will also pause during generation to solve the hands, even if they are not within my inpainting mask. If I change the inpainting mode to "only masked", the styling doesn't get enough information from the surrounding scene and doesn't blend into the background properly, etc.

This doesn't sound like a huge issue, I'm sure; who can't wait for one set of hands, right?

Well, currently I am working on a 6-person image, and let me tell you, it's NOT fun. I have to mask one person at a time, or the results stray too far from the likenesses I'm attempting to retain. ADetailer doesn't seem to be aware that inpainting involves a masked region. I think it should only process the masked region, even when the inpainting area is set to "whole picture", because the processing time adds up fast.
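The behavior being asked for could, in principle, look like the following sketch: before spending inpainting cycles on a detection, drop it if it doesn't sufficiently overlap the user's inpaint mask. This is not ADetailer's actual code; it's a hypothetical post-detection filter using NumPy, with all names invented for illustration.

```python
import numpy as np


def filter_detections(detection_masks, inpaint_mask, min_overlap=0.5):
    """Keep only detections that mostly fall inside the user's inpaint mask.

    detection_masks: list of boolean HxW arrays from the detector
    inpaint_mask:    boolean HxW array painted by the user in img2img
    min_overlap:     fraction of each detection that must lie inside the mask
    """
    kept = []
    for det in detection_masks:
        area = det.sum()
        if area == 0:
            continue  # empty detection, nothing to inpaint
        overlap = np.logical_and(det, inpaint_mask).sum() / area
        if overlap >= min_overlap:
            kept.append(det)
    return kept
```

In the 6-person example above, a filter like this would discard the five detected faces outside the dad's mask before any inpainting passes run, which is where the bulk of the wasted ~25 minutes goes.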

Example (love how everyone is getting a mustache): [image]

In the above image, I'm masking the dad, with text descriptors specific to the dad. The generation process is already slow because I'm pushing the very bounds of what my GPU memory can handle (CPU-only mode would be death). The amount of time ADetailer takes finding all 6 persons and generating them is insane. Without ADetailer enabled, this takes less than 2 minutes, but the result isn't usable in my case. With ADetailer enabled, it's taking ~25 minutes per generation.

I understand you say it's a tool that does inpainting, so why can't we get its inpainting results confined to the mask, like the normal inpainting process?

I'm bad at technical explanations, and I am way over my head even submitting a feature request; if it sounds like I expect things, I'm sorry, that's not my intent. I love the work that has been done to make ADetailer, and it's phenomenal to me that it's free. You owe me absolutely nothing, to be clear. I'm just trying to make a case for a feature that I think would dramatically improve image generation time in many scenarios. For all I know, this is technically impossible and I sound like a moron right now, haha.

Thanks for your time.

GuruVirus commented 11 months ago

ADetailer does not interface with any of the selected inpainting settings; it runs its own. So, more simply put, you're asking for an option to limit ADetailer's region detection to the inpaint mask area.

I personally don't think this is feasible, as this extension runs after inpaint (or t2i or i2i) has already completed.