vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: Control: larger batches (~5 and over) cause frontend and backend to go out of sync (Firefox) #3087

Open lbeltrame opened 4 months ago

lbeltrame commented 4 months ago

Issue Description

I thought this was an issue of the new Modern UI, but it is instead an issue in Control.

When running the txt2img Control workflow with larger batches (2 doesn't seem to trigger it reliably, 5 does), the frontend and the backend go out of sync at the end, so a "Finishing" text is displayed forever and further generations are not possible. This occurs at least with Firefox, the only browser I can test with. A page reload is required to allow further generations.

This was hardly noticeable in the old UI, because the Generate buttons were separate, but it is very evident in Modern UI, where there is a single Generate button for everything, so all workflows are impacted. In retrospect, I remember this occurring in the past, but I never figured out how to trigger it properly until the Modern UI was introduced.

I haven't been able to pinpoint the actual cause. After being asked by Vlad, I checked the time difference between the end of generation and the end of processing (paths removed):

11:58:33-719016 INFO     LoRA apply: ['great_lighting', 'xl_more_art-full_v1', 'Difference_AnimeFace', 'noribsXL_001_4'] patch=0.00 load=2.84                                                                                                               
11:58:33-752048 INFO     Base: class=StableDiffusionXLPipeline                                                                                                                                                                                              
Progress  2.00it/s █████████████████████████████████ 100% 20/20 00:09 00:00 Base
11:58:44-314487 INFO     Upscale: upscaler="ESRGAN 4x Ultrasharp" resize=0x0 upscale=1664x2432                                                                                                                                                              
11:58:44-735397 INFO     High memory utilization: GPU=63% RAM=33% {'ram': {'used': 10.18, 'total': 31.27}, 'gpu': {'used': 10.14, 'total': 15.98}, 'retries': 0, 'oom': 0}                                                                                  
Upscaling ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:03
11:58:48-760529 INFO     HiRes: class=StableDiffusionXLImg2ImgPipeline sampler="DPM++ 2M"                                                                                                                                                                   
Progress  2.40s/it ████████████████████████████████ 100% 15/15 00:35 00:00 Hires
11:59:26-333107 INFO     High memory utilization: GPU=67% RAM=33% {'ram': {'used': 10.18, 'total': 31.27}, 'gpu': {'used': 10.71, 'total': 15.98}, 'retries': 0, 'oom': 0}                                                                                  
11:59:28-883667 INFO     Saving: image="XXX.webp" type=WEBP resolution=1664x2432 size=0                                                             
11:59:34-295319 INFO     High memory utilization: GPU=61% RAM=33% {'ram': {'used': 10.18, 'total': 31.27}, 'gpu': {'used': 9.81, 'total': 15.98}, 'retries': 0, 'oom': 0}                                                                                   
11:59:34-640459 INFO     Processed: images=5 time=319.14 its=0.31 memory={'ram': {'used': 10.18, 'total': 31.27}, 'gpu': {'used': 7.73, 'total': 15.98}, 'retries': 0, 'oom': 0}                                                                            
11:59:34-678473 INFO     Saving: image="XXX-grid.jpg" type=JPEG resolution=8320x2432 size=0  

So there's roughly 50 seconds (probably less) between the end of the generation and the actual end of the run. There is no other information in the server log or in the JS console.
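For anyone who wants to measure these gaps instead of eyeballing them, here is a minimal Python sketch that computes the time between timestamped log lines. It assumes the `HH:MM:SS-ffffff` prefix seen in the excerpt above. The helper names (`parse_stamp`, `gap_seconds`, `deltas`) are made up for this example and are not part of SD.Next, and the sample lines are abbreviated copies of the log.

```python
import re
from datetime import datetime

# Assumed line prefix, taken from the log excerpt: HH:MM:SS-ffffff LEVEL message
STAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}-\d{6})\s+(\w+)\s+(.*)$")


def parse_stamp(stamp: str) -> datetime:
    """Parse a timestamp such as '11:59:34-640459' (no date, so no midnight rollover handling)."""
    return datetime.strptime(stamp, "%H:%M:%S-%f")


def gap_seconds(start: str, end: str) -> float:
    """Seconds elapsed between two timestamps from the same day."""
    return (parse_stamp(end) - parse_stamp(start)).total_seconds()


def deltas(lines):
    """Yield (seconds since previous timestamped line, message) pairs."""
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # progress bars carry no timestamp and are skipped
        stamp, _level, message = match.groups()
        current = parse_stamp(stamp)
        if previous is not None:
            yield (current - previous).total_seconds(), message.strip()
        previous = current


# Abbreviated lines from the excerpt above
sample = [
    '11:58:48-760529 INFO     HiRes: class=StableDiffusionXLImg2ImgPipeline sampler="DPM++ 2M"',
    "Progress  2.40s/it 100% 15/15 00:35 00:00 Hires",
    "11:59:26-333107 INFO     High memory utilization: GPU=67% RAM=33%",
    "11:59:34-640459 INFO     Processed: images=5 time=319.14 its=0.31",
]

for seconds, message in deltas(sample):
    print(f"+{seconds:7.2f}s  {message[:40]}")
```

Running the same loop over the whole `sdnext.log` would show whether the delay sits before or after the `Processed:` line.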

This does not occur with the regular txt2img workflow (but it works in a completely different way, so it's kind of expected).

Version Platform Description

11:45:47-084150 INFO Starting SD.Next
11:45:47-087137 INFO Logger: file="/home/lb/Coding/automatic/sdnext.log" level=INFO size=22296660 mode=append
11:45:47-088236 INFO Python 3.11.9 on Linux
11:45:47-158504 INFO Version: app=sd.next updated=2024-04-26 hash=cccbb4b3 branch=dev url=https://github.com/vladmandic/automatic/tree/dev
11:45:47-526947 INFO Updating main repository
11:45:48-141684 INFO Upgraded to version: cccbb4b3 Fri Apr 26 21:32:34 2024 -0400
11:45:48-154257 INFO Platform: arch=x86_64 cpu=x86_64 system=Linux release=6.8.7-1-default python=3.11.9

Relevant log output

No response

Backend

Diffusers

Branch

Dev

Model

SD-XL

vladmandic commented 4 months ago

i cannot reproduce this

lbeltrame commented 4 months ago

Are you trying with or without queues? I run with queues enabled, FTR. I've seen this in two different installations (one local, one on Paperspace). The fact that it depends on the number of images generated makes me think it's more of a client-side problem, although I have no idea how to debug it.

vladmandic commented 4 months ago

tried with and without queues.

lbeltrame commented 4 months ago

I think something gets stuck, because if I cut off the connection for a second, the problem magically resolves itself. I'll keep this issue open for a bit, but if no one else can reproduce it I'll close it and reopen it only if I find a clean way to reproduce it.
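To illustrate what "stuck" means here, below is a rough Python sketch of a stall detector: it polls a status callable and reports a stall when the state stops changing without ever reporting completion. Everything in it is hypothetical. `wait_until_done`, the `poll` callable, and the `completed`/`progress` fields are invented for the example and are not SD.Next's actual progress API or frontend code.

```python
import time
from typing import Callable


def wait_until_done(
    poll: Callable[[], dict],
    stall_timeout: float = 10.0,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return 'done' once poll() reports completion, or 'stalled' if the
    reported state has not changed for stall_timeout seconds."""
    last_state = None
    last_change = clock()
    while True:
        state = poll()
        if state.get("completed"):  # hypothetical field name
            return "done"
        if state != last_state:
            # Something moved, so restart the stall timer
            last_state = state
            last_change = clock()
        elif clock() - last_change >= stall_timeout:
            return "stalled"
        sleep(interval)
```

A client built this way could reconnect or reset its state on `"stalled"` instead of showing "Finishing" forever, which is roughly what cutting the connection by hand seems to do.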

lanice commented 4 months ago

I have the same / a similar issue. I don't even need to increase the batch count. If I do a normal text generation (in the Control tab) at 1024x1024, the process gets stuck at "control 100% finishing". If I do the same but as 512x512, no problem. In both cases the image is actually saved, so it's only the frontend that gets stuck.

By the way, I think it is related to what I reported in #2838 before. Back then I thought it was the act of using ControlNet, but it's actually just using the Control tab. I only noticed that now, when trying out Modern UI, since there is no more txt2img.

What I am wondering: what is the difference between using txt2img generation via the txt2img tab and via the Control tab? I previously thought that Control just uses the same workflow as txt2img under the hood when only a text prompt is used.

vladmandic commented 4 months ago

that's a very different issue.