AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Generation is stuck at ~97% indefinitely when CLI Total Progress is at 100% #4791

Open rackSD opened 1 year ago

rackSD commented 1 year ago

Is there an existing issue for this?

What happened?

Generations will sometimes get stuck at ~97% complete in the UI during generate/generate forever. During this period, the Total Progress in the CLI has reached 100% and remains there for several minutes. During this 'stuck' period, the GPU is still under load, but usage % and temperatures gradually drop.

After encountering the bug a few more times, I believe this has something to do with the txt2img tab. Doing anything that reloads it will resolve the 'stuck' state of the generation. This includes but is not limited to: refreshing the browser page, switching to different tabs (img2img, PNG Info, etc) then back to txt2img, and using the Restart Gradio and Refresh Components button.

The previously stuck generation will also promptly appear as a complete image in the /Output folder once any of the described actions above are taken.

After a refresh/Image Browser page load, the generation will be able to begin as normal.

Given a long enough time, the UI will occasionally output the image, but the total time taken for a single generation is several times longer than normal (1 minute per image balloons to 8 minutes per image). Following that, the CLI it/s updates from a baseline of 1.5-3.5 iterations per second to a significantly slower value of ~8 to 10 seconds per iteration. This value stays static while the generation is stuck, and only updates once the image is output.
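To put the two speed readings on the same scale, here is a small arithmetic sketch. It uses only the figures quoted above; pairing the lower bounds together and the upper bounds together is an assumption, and `slowdown_factor` is a hypothetical helper name, not part of the web UI.

```python
# Figures quoted in the report above; how the bounds pair up is an assumption.
BASELINE_IT_PER_S = (1.5, 3.5)   # normal speed, iterations per second
STUCK_S_PER_IT = (8.0, 10.0)     # speed shown after a stuck generation, seconds per iteration


def slowdown_factor(baseline_it_per_s: float, stuck_s_per_it: float) -> float:
    """Return how many times longer one iteration takes at the stuck rate."""
    # Baseline seconds per iteration is 1 / baseline_it_per_s, so the ratio
    # stuck / baseline simplifies to stuck_s_per_it * baseline_it_per_s.
    return stuck_s_per_it * baseline_it_per_s


smallest = slowdown_factor(BASELINE_IT_PER_S[0], STUCK_S_PER_IT[0])
largest = slowdown_factor(BASELINE_IT_PER_S[1], STUCK_S_PER_IT[1])
print(f"per-iteration slowdown: {smallest:g}x to {largest:g}x")
```

With these numbers the reported CLI rate corresponds to each iteration taking roughly 12 to 35 times longer than the baseline, which is a larger factor than the 1 minute to 8 minutes wall-clock change mentioned above.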

Below is a gif of the bug, with the UI stuck at 97% while the CLI shows 100% complete. View in HD, otherwise the gif is bloody small.

https://gfycat.com/grossdampbarasinga

Steps to reproduce the problem

This bug is not consistently reproducible, and I was unable to identify the cause. Once the first occurrence begins, it is likely to occur for the next few generations in a row, persisting past restarting the WebUI process or hard-refreshing the browser.

  1. Click generate/generate forever
  2. Wait for generation to be stuck at ~97%.
  3. Either refresh the page with F5 or switch to another tab (img2img, PNGInfo, etc) and back.
  4. Generation completes, the image file is created in the /output/ folder, and I am now able to start generating the next image.

What should have happened?

Generations should not be stuck at 97% while the CLI has been at 100% for the past 10 minutes, when an image at identical settings usually takes 1 minute to output.

Commit where the problem happens

98947d173e3f1667eba29c904f681047dea9de90

What platforms do you use to access UI ?

Windows

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

No command line arguments.

Additional information, context and logs

No response

2793145003 commented 1 year ago

I don't know if I have the same problem, but I often get stuck while generating, especially when I run inpainting, at any progress in the UI, maybe 45%, or usually 95%. I get 100% progress in the command line and result images in the output folder, but the web UI keeps showing interrupt/skip forever, and neither of these two buttons works.

rackSD commented 1 year ago

What you described sounds like the same problem as I am encountering.

I am pretty convinced that it is an issue with the UI due to how switching tabs resolves the issue. This problem only began a week or so back, and was not present before that time.

molkemon commented 1 year ago

I am having the same issue but for me it resolves if I switch to a different browser Tab.

The error works like this. Gen anything, works fine. Gen anything, then while it is genning, if I click the previous gen so it goes "fullscreen" and then make it small again, a weird timer thing overlaps the prompt info below the gen image and the gen never completes despite 100%, UNTIL YOU SWITCH TO A DIFFERENT BROWSER TAB. Then it will complete and you can continue as normal.

Bug can be prevented by not fullscreening the previous gen while genning a new one. Very weird bug.

fsilveiradev commented 1 year ago

I can confirm the issue still persists, as I am having the same symptoms with automatic1111 and extensions updated.

FutonGama commented 1 year ago

This problem has been happening to me for months already and I don't know how to fix it. I have to change the tab to finish every generation. I'm going crazy trying to solve this, and nothing works.

Sousheyyy commented 1 year ago

Still having the same issue. Stuck at 98% every time.

zpengcom commented 1 year ago

Stuck at 95% 3S. The terminal shows 100%, no files are written to the outputs\txt2img-images directory.

version: [v1.5.1] •  python: 3.10.10  •  torch: 2.0.1+cu117  •  xformers: 0.0.20  •  gradio: 3.32.0

zpengcom commented 1 year ago

Stuck at 95% 3S. The terminal shows 100%, no files are written to the outputs\txt2img-images directory.

version: [v1.5.1] •  python: 3.10.10  •  torch: 2.0.1+cu117  •  xformers: 0.0.20  •  gradio: 3.32.0

I tried reinstalling and updating dependencies, with no effect. Then I disabled all extensions, and the problem was solved, so I tried to troubleshoot the problem extensions until it: (image) problem solved. By the way, when I switched to the SDXL model, it seemed to stutter for a few minutes at 95%, but the results were OK.
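The "disable everything, then narrow it down" approach described above can be done by halving the set of enabled extensions each round. The following is a minimal sketch under stated assumptions: exactly one extension causes the hang, and `hangs_with` is a hypothetical callback that you supply yourself (for example by enabling only the given extensions, restarting the web UI, generating an image, and answering whether it got stuck). Nothing here calls the web UI.

```python
from typing import Callable, Optional, Sequence


def find_culprit(
    extensions: Sequence[str],
    hangs_with: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect a list of extension names down to the single one that triggers the hang.

    Assumes one culprit. Returns None if the hang does not occur even with
    every extension in the list enabled.
    """
    candidates = list(extensions)
    if not candidates or not hangs_with(candidates):
        return None
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        if hangs_with(first_half):
            # The culprit is among the extensions that were just enabled.
            candidates = first_half
        else:
            # Under the one-culprit assumption it must be in the other half.
            candidates = candidates[len(first_half):]
    return candidates[0]
```

With N extensions this needs about log2(N) test generations instead of N. If the hang is intermittent, as several reports in this thread suggest, each check would need a few generations before a half can be trusted as clean.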

stevieraykatz commented 12 months ago

This might be a silly fix for some, but I noticed that it was hanging on img save and by creating the outputs directory manually, it started completing generations. Hope this helps someone!
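For anyone who wants to try the same workaround from a script, a minimal sketch follows. The `outputs/txt2img-images` path is taken from other comments in this thread and may differ on your install; `ensure_output_dir` and the `webui_root` argument are hypothetical names used only for illustration.

```python
from pathlib import Path


def ensure_output_dir(webui_root: str, relative: str = "outputs/txt2img-images") -> Path:
    """Create the output folder (and any missing parents) if it does not exist yet."""
    target = Path(webui_root) / relative
    # exist_ok=True makes this safe to run when the folder is already there.
    target.mkdir(parents=True, exist_ok=True)
    return target


# Example call, assuming the script is run from the web UI's install folder:
# ensure_output_dir(".")
```

As a later comment notes, creating the folder manually did not help in every case, so treat this as one thing to rule out rather than a fix.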

ZmeyGer commented 11 months ago

Same problem. Freezes at 90 - 98% while the terminal shows 100%. Manual creation of the output folder does not help. Switching interface and browser tabs does not solve the problem. Refreshing the page with F5 leads to a complete reset of the generation results without a picture appearing in the result folder. I tried to roll back to previous commits. At some point, it seemed that the problem was solved, but today it hangs at 98% again.

Browsers tried: MSEdge, Opera, Brave.

Attempts to "play" with command line parameters known to me did not succeed.

Maybe the choice of sampling method has some influence on this? Before the hang I was using "Euler a", and the hang happened after switching to "DPM++ 2M SDE Karras".

ppetrucz commented 11 months ago

Same, stuck at around 97%. Switching tab (edge) then back completes the gen.

zpengcom commented 11 months ago

Stuck at 95% 3S. The terminal shows 100%, no files are written to the outputs\txt2img-images directory. version: [v1.5.1] •  python: 3.10.10  •  torch: 2.0.1+cu117  •  xformers: 0.0.20  •  gradio: 3.32.0

I tried reinstalling and updating dependencies, with no effect. Then I disabled all extensions, and the problem was solved, so I tried to troubleshoot the problem extensions until it: (image) problem solved. By the way, when I switched to the SDXL model, it seemed to stutter for a few minutes at 95%, but the results were OK.

I might just have a bad hard drive :(

ppetrucz commented 11 months ago

I was able to resolve it by deleting no-half-vae from the args. Now it works just fine.

ghostsquad commented 10 months ago

I'm seeing something similar, and the commandline logs show this (as an example of txt2img)

To create a public link, set `share=True` in `launch()`.
Startup time: 97.0s (prepare environment: 61.2s, import torch: 3.3s, import gradio: 1.3s, setup paths: 0.6s, initialize shared: 0.2s, other imports: 1.3s, setup codeformer: 0.2s, list SD models: 2.6s, load scripts: 7.3s, create ui: 14.4s, gradio launch: 4.5s, app_started_callback: 0.2s).
WARNING:multipart.multipart:Consuming a byte in the end state
WARNING:multipart.multipart:Consuming a byte in the end state
2023-09-17 16:39:28,854 - ControlNet - WARNING - Failed to parse infotext, legacy format infotext is no longer supported:
Module: dw_openpose_full,Model: control_v11p_sd15_openpose [cab727d4],Weight:1,Resize Mode: Crop and Resize,Low Vram: False,Processor Res:512,Guidance Start:0,Guidance End:1,Pixel Perfect: False,Control Mode: Balanced
INFO:sd_dynamic_prompts.dynamic_prompting:Prompt matrix will create 4 images in a total of 1 batches.
100%|██████████████████████████████████████████████████████████████████████████████████| 14/14 [00:09<00:00,  1.55it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 14/14 [00:07<00:00,  1.80it/s] 

The images aren't written to disk. So it appears that this is the problem. There's something going on when trying to write to disk.

Changing the "Live Preview" from Full to TAESD seems to have resolved the issue.
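To check whether images actually reach disk while the UI looks stuck, one option is to list files under the output folder that were modified after the generation started. This is a sketch only: `files_modified_since` is a hypothetical helper, and the folder you pass in is whatever your install uses for outputs.

```python
from pathlib import Path
from typing import List


def files_modified_since(directory: str, since_epoch: float) -> List[Path]:
    """Return files under `directory` whose modification time is at or after `since_epoch`."""
    root = Path(directory)
    if not root.is_dir():
        # A missing output folder is itself a useful finding, so report "nothing written".
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime >= since_epoch
    )


# Example, assuming you noted time.time() just before clicking Generate:
# files_modified_since("outputs", started_at)
```

An empty result after the CLI reports 100% would point at the save step, while a non-empty result would suggest the image was written and only the UI failed to update.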

installation data:

version: [v1.6.0](https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/5ef669de080814067961f28357256e8fe27544f4)  •  python: 3.10.11  •  torch: 2.0.1+cu118  •  xformers: 0.0.20  •  gradio: 3.41.2  •  checkpoint: [ec41bd2a82](https://google.com/search?q=ec41bd2a8271acde4ae81cac004d9f3300e7fb0870eae8cbfe0bbc4ef8e27f91)

Trick42-1980 commented 6 months ago

https://nvidia.custhelp.com/app/answers/detail/a_id/5490

ilcane87 commented 6 months ago

https://nvidia.custhelp.com/app/answers/detail/a_id/5490

Seems unrelated to me.

Trick42-1980 commented 6 months ago

https://nvidia.custhelp.com/app/answers/detail/a_id/5490

Unfortunately, the pictures are barely visible. You have to choose: ph

It works for me.

ilcane87 commented 6 months ago

It works for me.

I've had that setting enabled for a long time, and all that time I've still had generations randomly getting stuck, forcing me to restart the webui. Besides, that setting only takes effect when you're consuming a lot of VRAM, while this issue happens even on single image generations that use very little of it (and I've got 24GB).

So either you had a different issue that was related to VRAM, or your issue is still there and it just hasn't happened to you yet since enabling the setting.

Trick42-1980 commented 6 months ago

I only found that solution to the problem yesterday. I performed several operations requiring large amounts of VRAM, and there were no problems. It worked the first time, and I haven't tested it since. I will write if I experience anything further.