vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Feature]: stable code api for batch processing #528

Open ljleb opened 1 year ago

ljleb commented 1 year ago

Feature description

At the moment, the code for img2img batch sequence processing is restricted to this function:

https://github.com/vladmandic/automatic/blob/93b0de7e599453027ad7cab6266b42920ebc1250/modules/img2img.py#L16

And it is impossible to know whether we are in batch mode or not from an extension without resorting to hacks. See for example how it was done for controlnet:

https://github.com/Mikubill/sd-webui-controlnet/blob/93b0f9e1b7cc246165666b7b307bc8243db2c3f4/scripts/batch_hijack.py#L99

Additionally, this batch sequence mode and the "batch count" and "batch size" sliders of the webui can easily be confused with one another.

It would be nice if it was possible for extension scripts to:

  1. know whether they are in sequential batch mode
  2. control whether the current generation is in batch mode, depending on the extension generation parameters (the gradio components returned by the ui() callback)
    • note also that this means extensions would be able to enter batch mode during txt2img generation for example, which is currently not an existing concept in the webui

Providing a unified interface to iterate over multiple images for generation would allow extensions to communicate with each other and make it possible (or at least easier) to create more complex workflows for movie2movie that are currently not possible (or at least hard to implement).
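
To make the request above more concrete, here is a minimal sketch of what such a unified interface could look like. Everything in it is an assumption made for illustration: `BatchSequence`, `request_frames`, `active`, and `frames` are hypothetical names that exist in neither this repo nor A1111, and file names stand in for actual images.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class BatchSequence:
    """Hypothetical shared object describing one sequential batch run."""

    # Frame lists contributed by extensions, keyed by extension name.
    sources: Dict[str, List[str]] = field(default_factory=dict)

    def request_frames(self, extension: str, frames: List[str]) -> None:
        """Called by an extension that wants to contribute frames to the run."""
        if frames:
            self.sources[extension] = list(frames)

    @property
    def active(self) -> bool:
        """True when at least one extension has asked for batch mode."""
        return bool(self.sources)

    def __len__(self) -> int:
        # The run lasts as long as the longest contributed sequence.
        return max((len(f) for f in self.sources.values()), default=0)

    def frames(self) -> Iterator[Dict[str, str]]:
        """Yield, for each step, the frame every extension supplies for it."""
        for i in range(len(self)):
            yield {name: f[i] for name, f in self.sources.items() if i < len(f)}


# Two extensions contribute to the same run instead of one taking ownership.
seq = BatchSequence()
seq.request_frames("controlnet", ["pose_0.png", "pose_1.png"])
seq.request_frames("img2img", ["frame_0.png", "frame_1.png"])
steps = list(seq.frames())
```

In this sketch an extension can both query batch mode (`seq.active`) and trigger it (`request_frames`), which would also cover the txt2img case from point 2, since nothing here depends on the img2img tab.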

As a side concern: should I open this issue in https://github.com/AUTOMATIC1111/stable-diffusion-webui instead? I am confused as to which repo should receive this kind of code change request.

Version Platform Description

-

vladmandic commented 1 year ago

Overall, a valid request. I'd love to hear more about the suggested design, if you have one.

> As a side concern: should I open this issue in https://github.com/AUTOMATIC1111/stable-diffusion-webui instead? I am confused as to which repo should receive this kind of code change request.

I know it's tedious, but ideally both. This repo began as a fork of A1111, but it's quickly evolving, while A1111 has not had any commits for over a month now. So any PR to A1111 would likely sit there for a while. The quick and safe route: if you're willing to make a proposal, I'd review and merge it here, and then you can open the same PR against A1111.

ljleb commented 1 year ago

To be honest, I have not yet looked into how to make this work concretely. I think there is a way, since there isn't much existing code for batch sequence processing.

I can look into different designs and make suggestions later next week. I won't be able to look at or push code for some time.

vladmandic commented 1 year ago

to access the information, what do you think about exposing it out-of-band? for example, extending the shared.state object with additional properties that can be read either directly (shared.state is safe to access directly from extensions) or via the api at /sdapi/v1/progress
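
a rough sketch of that idea, under loud assumptions: `State` below is a stand-in class, not the real shared.state, and the fields `batch_mode`, `batch_index`, and `batch_total` as well as the method names are hypothetical. it only shows how batch information could be kept in one place and then serialized for a progress response.

```python
from dataclasses import asdict, dataclass


@dataclass
class BatchInfo:
    """Hypothetical batch fields; not part of the real shared.state."""

    batch_mode: bool = False  # True while a sequential batch is running
    batch_index: int = 0      # zero-based index of the image being processed
    batch_total: int = 0      # number of images in the sequence


class State:
    """Stand-in for shared.state, reduced to the hypothetical batch fields."""

    def __init__(self) -> None:
        self.batch = BatchInfo()

    def begin_batch(self, total: int) -> None:
        self.batch = BatchInfo(batch_mode=True, batch_index=0, batch_total=total)

    def next_image(self) -> None:
        self.batch.batch_index += 1

    def end_batch(self) -> None:
        self.batch = BatchInfo()

    def progress_payload(self) -> dict:
        """Fields a progress endpoint could merge into its JSON response."""
        return asdict(self.batch)


state = State()
state.begin_batch(total=3)
state.next_image()
payload = state.progress_payload()
```

an extension would read `state.batch.batch_mode` directly, while an api client would see the same values in the progress response.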

that still leaves actually defining a clean batch method that extensions could use. not really a problem, but i'd like to understand the desired workflow a bit more so i don't miss something.

ljleb commented 1 year ago

Well there are existing scripts at the moment that will call process_images multiple times in a row. Ideally these should work in conjunction with the new design for batch sequence processing. I'd think it is possible, but if it is impossible or too hard then I'd consider this optional.

A clear example of something we want to be able to do is this: https://github.com/Mikubill/sd-webui-controlnet/pull/683#issuecomment-1533207733 I think they did a good job of describing their desired workflow. The point of the new design is to make it possible for extensions to contribute to a batch sequence process in different ways, rather than taking ownership of it.

I don't have a specific design in mind, I'll take a look at it and maybe open a PR in a1111 if I have something I think is good enough for both repos.

ljleb commented 1 year ago

we can rely on global state for whether we are in batch mode or not, I think this can be useful. However, to update the state or the generation settings, we may want instead to provide facilities directly inside the scripts, for example by providing callbacks similar to process, process_batch etc. but for batch processing instead.
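
A minimal sketch of what such callbacks might look like, assuming hook names that do not exist today: `before_batch_sequence` and `process_batch_frame` are hypothetical, `Script` is a stand-in base class rather than the real one, and the actual image generation call is left out.

```python
from typing import List


class Script:
    """Stand-in base class; both batch-sequence hooks are hypothetical."""

    def before_batch_sequence(self, frames: List[str]) -> None:
        """Called once before the first frame of a sequence."""

    def process_batch_frame(self, index: int, params: dict) -> None:
        """Called for every frame; may adjust that frame's parameters."""


class SeedPerFrame(Script):
    """Example extension: gives each frame its own seed."""

    def process_batch_frame(self, index: int, params: dict) -> None:
        params["seed"] = params["base_seed"] + index


def run_batch_sequence(scripts: List[Script], frames: List[str], base_seed: int) -> List[dict]:
    """Drive the hooks for each frame; image generation itself is omitted."""
    for script in scripts:
        script.before_batch_sequence(frames)
    results = []
    for index, frame in enumerate(frames):
        params = {"init_image": frame, "base_seed": base_seed}
        # Every script gets a chance to contribute to this frame.
        for script in scripts:
            script.process_batch_frame(index, params)
        results.append(params)
    return results


results = run_batch_sequence([SeedPerFrame()], ["a.png", "b.png"], base_seed=100)
```

Because each script only edits the per-frame parameters, several extensions could take part in the same sequence without any one of them owning the loop.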