lllyasviel / ControlNet

Let us control diffusion models!
Apache License 2.0
30.35k stars, 2.73k forks

[Feature Request]: Batch script for txt2img for higher quality generations from multiple image sources/animation frames OR ... #171

Open marcsyp opened 1 year ago

marcsyp commented 1 year ago

Is there an existing issue for this?

I have searched the existing issues and checked the recent builds/commits

What would your feature do?

I have been experimenting with using img2img with a source image and the same image to control all of my controlnet networks vs doing the same exact setup in txt2img (i.e., without the source image as a starting point) -- and I have found txt2img to provide consistently better results, even after playing with weights, etc., in img2img.

However, txt2img doesn't currently allow you to process images in batch (makes sense). The current workflow for processing a large number of images is painful: you replace the source image for each controlnet one by one, click generate, and then move on to the next image (this is at least less painful with the new tab interface, but still painful).

Proposed workflow

1. Load a "ControlNet Batch" script.
2. Provide a directory of source images (ideally one directory for each controlnet, up to 10).
3. For each image, the script replaces the controlnet source image (for each controlnet) with the corresponding image from that controlnet's directory, runs a txt2img, and moves to the next.

Additional Features (Nice to haves)

- Ability to vary the prompt by index position (a directory of text files with different prompts, with a way to assign each file to multiple index positions, and the A1111 input as a fallback?)
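To make the proposed workflow concrete, here is a minimal Python sketch of the pairing logic only. It is an illustration, not an existing script: the names `list_images`, `prompt_for_index`, and `iter_batch` are hypothetical, the `<index>.txt` prompt file naming is an assumption, and the actual txt2img call is deliberately left out because the way to invoke it depends on the web UI version.

```python
from pathlib import Path
from typing import Iterator, List, Optional, Sequence, Tuple

# Assumption: these are the image types worth batching.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_images(directory: Path) -> List[Path]:
    """Return the image files in a directory, sorted by file name."""
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def prompt_for_index(index: int, prompt_dir: Optional[Path], fallback: str) -> str:
    """Use <prompt_dir>/<index>.txt when it exists, else the fallback prompt.

    The <index>.txt naming scheme is hypothetical.
    """
    if prompt_dir is not None:
        candidate = prompt_dir / f"{index}.txt"
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8").strip()
    return fallback


def iter_batch(
    control_dirs: Sequence[Path],
    prompt_dir: Optional[Path],
    fallback_prompt: str,
) -> Iterator[Tuple[int, str, Tuple[Path, ...]]]:
    """Yield (index, prompt, one image per controlnet) for each batch step.

    Images are matched across directories by their sorted position, so every
    directory must hold the same number of images.
    """
    per_unit = [list_images(Path(d)) for d in control_dirs]
    if len({len(images) for images in per_unit}) != 1:
        raise ValueError("need at least one directory, all with equal image counts")
    for index, images in enumerate(zip(*per_unit)):
        yield index, prompt_for_index(index, prompt_dir, fallback_prompt), images
```

Each yielded tuple would then be handed to whatever performs the generation, for example a loop such as `for index, prompt, images in iter_batch(dirs, prompt_dir, ui_prompt): ...` that sets one control image per controlnet and runs a single txt2img.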

Another thought:

This may be possible to accomplish in img2img by simply providing a checkbox in the controlnet UI to ignore source image (may need to be a global checkbox?)


Just a few use cases to think about regarding this feature. These are particularly related to animation, but could easily be applied to other use cases.

ANIMATION

- Using the normal mapping with a background or foreground threshold to isolate subject matter. This works really well in txt2img, but fails miserably in img2img because the background content being isolated still provides weighted noise to the controlnet.
- Using low-resolution preprocessing to remove detail from a scene but maintain overall coherence, which works much better in txt2img.

BATCH IMAGES

- Using standard txt2img results with excellent composition as a low-fidelity proxy for a txt2img controlnet pass that adds high levels of detail without polluting the result with garbage pixel data.
- The ability to process a batch of content that has been preprocessed elsewhere (for instance, normal maps or depth maps produced in external applications such as Blender), without needing to use the maps themselves as source data in img2img (which pollutes the controlnet result).

Thanks!

johndpope commented 1 year ago

related? - https://github.com/Mikubill/sd-webui-controlnet/issues/296 https://github.com/Mikubill/sd-webui-controlnet/issues/243

control + batch https://www.youtube.com/watch?v=3FZuJdJGFfE&ab_channel=OlivioSarikas

enn-nafnlaus commented 1 year ago

I too could use this feature.

joshdstanton commented 1 year ago

@marcsyp I've been working on the same thing and have been very frustrated by the poor results of img2img vs txt2img. Would love a batch option in txt2img using controlnet. Hoping this feature is released soon.

marcsyp commented 1 year ago

@joshdstanton The video referenced above was actually helpful for me -- at around 7:10 he demonstrates how removing the source images from the img2img tab and controlnets actually accomplishes the task described in this issue. Combined with the checkbox in the ControlNet settings tab to "skip img2img processing", you can get what seems to be txt2img results in batch.

That said, I believe features with multiple controlnet source directories are being worked on to allow finer grained control of batch processing with multiple controlnet types/sources, which will add a whole new level of power.

enn-nafnlaus commented 1 year ago

That video above isn't at all helpful for me. It's an insanely long way to just describe "standard batch img2img, except with ControlNet enabled".

That's not what we need. We need to be able to batch-specify ControlNet images, not source images. That is to say, to be able to use ControlNet in txt2img without a source image; or, to use ControlNet in img2img with a net that's not derived from the base image.

marcsyp commented 1 year ago

> That video above isn't at all helpful for me. It's an insanely long way to just describe "standard batch img2img, except with ControlNet enabled".
>
> That's not what we need. We need to be able to batch-specify ControlNet images, not source images. That is to say, to be able to use ControlNet in txt2img without a source image;

This first part IS possible -- using the two key workflow points I described (removing source images from the img2img tab and turning on "skip img2img processing" in Settings).

> or, to use ControlNet in img2img with a net that's not derived from the base image.

As I said, this is what is not currently possible, but is being worked on. Discussion here: https://github.com/Mikubill/sd-webui-controlnet/issues/243.

enn-nafnlaus commented 1 year ago

> This first part IS possible -- using the two key workflow points I described (removing source images from the img2img tab and turning on "skip img2img processing" in Settings).

Unfortunately, that isn't what's being talked about. That lets you process many images with one mask. Not a batch of masks. There is no way to provide a batch of masks.

Good to know that the proper solution is being worked on, though :)

marcsyp commented 1 year ago

Just a note that I've found a way to properly hack a batch run of multi-controlnet in txt2img, at least for images that are all of the same pixel dimensions.

A trick that works: Use the controlnet m2m script with mov or mp4 files that are assembled (via ffmpeg, etc) using the images that you would like to batch. You can use different mov files for different controlnets with source imagery and depth maps or masked content, etc.

It's not a perfect solution, and it doesn't solve enn-nafnlaus' prompt travel concerns expressed elsewhere, but it does let us experiment in the meantime while proper batching is being developed.
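As a rough illustration of the assembly step in the trick above, the following Python sketch builds an ffmpeg command that turns a numbered image sequence into an mp4, one video per controlnet. The function names, the `frame_%04d.png` naming pattern, the frame rate, and the codec choice are all assumptions made for the example, not settings taken from this thread. Note that libx264 with `yuv420p` is lossy and requires even pixel dimensions, so control maps may be altered slightly by the encoding.

```python
import subprocess
from typing import List


def build_ffmpeg_command(frame_pattern: str, output: str, fps: int = 24) -> List[str]:
    """Build an ffmpeg command that encodes a numbered image sequence to video.

    frame_pattern is a printf-style sequence pattern such as
    "depth/frame_%04d.png" (hypothetical naming).
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-framerate", str(fps),   # input rate: one source image per frame
        "-i", frame_pattern,      # numbered image sequence as input
        "-c:v", "libx264",        # assumed codec; any codec m2m can read would do
        "-pix_fmt", "yuv420p",    # widely compatible pixel format
        output,
    ]


def assemble_video(frame_pattern: str, output: str, fps: int = 24) -> None:
    """Run ffmpeg; raises CalledProcessError if the encode fails."""
    subprocess.run(build_ffmpeg_command(frame_pattern, output, fps), check=True)
```

Calling `assemble_video("depth/frame_%04d.png", "depth.mp4")` and then again for a second directory would give one video per controlnet. Because each image becomes exactly one frame, the videos stay aligned as long as every directory holds the same number of images.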

jimys commented 1 year ago

> Just a note that I've found a way to properly hack a batch run of multi-controlnet in txt2img, at least for images that are all of the same pixel dimensions.
>
> A trick that works: Use the controlnet m2m script with mov or mp4 files that are assembled (via ffmpeg, etc) using the images that you would like to batch. You can use different mov files for different controlnets with source imagery and depth maps or masked content, etc.
>
> It's not a perfect solution, and it doesn't solve enn-nafnlaus' prompt travel concerns expressed elsewhere, but it does let us experiment in the meantime while proper batching is being developed.

When I use m2m, I can't stop a run in progress; I have to restart the webui every time.