vladmandic / automatic

SD.Next: Advanced Implementation of Generative Image Models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: Hi-Res Fix takes forever after upgrading #1638

Closed Rojinski closed 1 year ago

Rojinski commented 1 year ago

Issue Description

Hello everyone, hello to the Vlad Team :)

Since I've upgraded, the hi-res fix (x2) takes 10x more time than it did before... Generating a picture (768x768, restore face, xformers, fp16) takes 15 secs... The Hi-Res fix (x2) can now take 30 minutes.... I have an RTX 3070 Ti... The very same operation on Automatic 1111 (same seed) takes 2 mins max. (I use 4x-UltraSharp or ESRGAN 4x+, 100 hires steps.)

Any idea?

Thanks in advance for your help.

25%|███████████████████▊ | 25/100 [16:40<1:00:06, 48.09s/it]

Version Platform Description

Using VENV: D:\Programmes 2\Logiciels\Vlad\automatic\venv
21:10:08-473657 INFO     Starting SD.Next
21:10:08-480691 INFO     Python 3.10.10 on Windows
21:10:08-517885 INFO     Version: 75a8c1f9 Mon Jul 10 11:44:52 2023 -0400
21:10:08-930911 INFO     nVidia CUDA toolkit detected
21:10:10-210115 INFO     Torch 2.0.1+cu118
21:10:10-223619 INFO     Torch backend: nVidia CUDA 11.8 cuDNN 8700
21:10:10-225191 INFO     Torch detected GPU: NVIDIA GeForce RTX 3070 Ti VRAM 8192 Arch (8, 6) Cores 48
21:10:10-345943 INFO     Enabled extensions-builtin: ['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora', 'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-dynamic-thresholding', 'sd-extension-aesthetic-scorer', 'sd-extension-steps-animation', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sd-webui-model-converter', 'seed_travel', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg', 'SwinIR']
21:10:10-349945 INFO     Enabled extensions: ['adetailer', 'Config-Presets', 'model-keyword', 'openpose-editor', 'SD-CN-Animation', 'sd-dynamic-prompts', 'sd-webui-3d-open-pose-editor', 'sd-webui-additional-networks', 'sd-webui-aspect-ratio-helper', 'sd-webui-roop', 'sd_dreambooth_extension', 'ultimate-upscale-for-automatic1111']
21:10:10-354941 INFO     Verifying requirements
21:10:10-363806 WARNING  Package wrong version: accelerate 0.19.0 required 0.20.3
21:10:10-364806 INFO     Installing package: accelerate==0.20.3
21:10:12-541695 WARNING  Package wrong version: diffusers 0.16.1 required 0.18.1
21:10:12-543227 INFO     Installing package: diffusers==0.18.1
21:10:16-196587 INFO     Verifying packages
21:10:16-198587 INFO     Verifying repositories
21:10:19-233099 INFO     Verifying submodules
21:10:43-839687 INFO     Extension installed packages: sd_dreambooth_extension ['accelerate==0.19.0', 'diffusers==0.16.1']
21:10:43-934736 INFO     Extensions enabled: ['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora', 'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-dynamic-thresholding', 'sd-extension-aesthetic-scorer', 'sd-extension-steps-animation', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sd-webui-model-converter', 'seed_travel', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg', 'SwinIR', 'adetailer', 'Config-Presets', 'model-keyword', 'openpose-editor', 'SD-CN-Animation', 'sd-dynamic-prompts', 'sd-webui-3d-open-pose-editor', 'sd-webui-additional-networks', 'sd-webui-aspect-ratio-helper', 'sd-webui-roop', 'sd_dreambooth_extension', 'ultimate-upscale-for-automatic1111']
21:10:43-937085 INFO     Verifying packages
21:10:43-942088 INFO     Extension preload: 0.0s D:\Programmes 2\Logiciels\Vlad\automatic\extensions-builtin
21:10:43-944601 INFO     Extension preload: 0.0s D:\Programmes 2\Logiciels\Vlad\automatic\extensions
21:10:43-954610 INFO     Server arguments: []
21:10:47-839464 INFO     Pipeline: Backend.ORIGINAL
21:10:48-721833 INFO     Libraries loaded
21:10:48-723840 INFO     Using data path: D:\Programmes 2\Logiciels\Vlad\automatic
21:10:48-731873 INFO     Available VAEs: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE 16
21:10:48-862242 INFO     Available models: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion 305
21:10:50-836072 INFO     ControlNet v1.1.232
ControlNet preprocessor location: D:\Programmes 2\Logiciels\Vlad\automatic\extensions-builtin\sd-webui-controlnet\annotator\downloads
21:10:50-963382 INFO     ControlNet v1.1.232
Image Browser: ImageReward is not installed, cannot be used.
[-] ADetailer initialized. version: 23.7.5, num models: 9
21:10:52-589403 INFO     Libraries loaded
[AddNet] Updating model hashes... 0it [00:00, ?it/s]
2023-07-10 21:10:53,201 - roop - INFO - roop v0.0.2
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion\beautifulArt_v30.safetensors
21:10:53-495560 INFO     Setting Torch parameters: dtype=torch.float16 vae=torch.float16 unet=torch.float16
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
21:10:55-245797 INFO     Applying xformers cross attention optimization
21:10:55-403667 INFO     Embeddings: loaded=249 skipped=29
21:10:55-409674 INFO     Model loaded in 2.1s (load=0.1s create=0.3s apply=0.4s vae=0.6s move=0.4s embeddings=0.2s)
21:10:55-547320 INFO     Model load finished: {'ram': {'used': 3.22, 'total': 31.88}, 'gpu': {'used': 3.13, 'total': 8.0}, 'retries': 0, 'oom': 0} cached=0
21:10:56-158304 INFO     Loading UI theme: name=gradio/default style=Auto
21:10:57-652922 INFO     Available models: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion 305
CUDA SETUP: Loading binary D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Running on local URL: http://127.0.0.1:7860
21:11:00-475846 INFO     Local URL: http://127.0.0.1:7860/
21:11:00-476854 INFO     Initializing middleware
21:11:00-619851 INFO     [AgentScheduler] Task queue is empty
21:11:00-620851 INFO     [AgentScheduler] Registering APIs
21:11:01-135485 INFO     Startup time: 17.2s (torch=2.7s gradio=0.8s libraries=1.2s models=0.1s scripts=7.1s onchange=0.1s ui-txt2img=0.7s ui-img2img=0.1s ui-settings=0.1s ui-extensions=3.0s ui-defaults=0.1s launch=0.3s app-started=0.3s checkpoint=0.4s)

Acknowledgements

brknsoul commented 1 year ago

Downgrade to nvidia driver 531.

Rojinski commented 1 year ago

Downgrade to nvidia driver 531.

Just for Vlad? It works well on everything else (A1111, InvokeAI, etc.) and I need my nvidia driver to be up-to-date. But thank you for your answer. ;)

Rojinski commented 1 year ago

Downgrade to nvidia driver 531.

OK, I've tried to roll back the driver but the option is greyed out... :/ So, I can't downgrade it....

brknsoul commented 1 year ago

You'll have to actually uninstall the driver (DDU helps here) and find and download v531. The problem will happen on any form of Stable Diffusion.

Or you could wait until the next driver release which may fix the issue.

Rojinski commented 1 year ago

You'll have to actually uninstall the driver (DDU helps here) and find and download v531. The problem will happen on any form of Stable Diffusion.

Or you could wait until the next driver release which may fix the issue.

Thanks a lot for your help. I did it: uninstalled, then installed v531.... It's a lot faster indeed, BUT I get a bunch of error messages in the console. It's not displaying the work anymore (the transformation of the image). But, hey, I got a picture at the end.

100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:10<00:00, 3.70it/s]
 40%|████████████████████████████████▊                                                 | 16/40 [00:24<00:50, 2.11s/it]
12:05:35-699059 ERROR    API error: POST: http://127.0.0.1:7860/internal/progress {'error': 'OutOfMemoryError', 'detail': '', 'body': '', 'errors': 'CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 6.17 GiB already allocated; 691.00 MiB free; 6.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF'}
12:05:35-700706 ERROR    HTTP API: OutOfMemoryError
Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:98 in receive
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:93 in receive_nowait
WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:78 in call_next
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:118 in receive
EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\modules\middleware.py:41 in log_and_time
     40 │ ts = time.time()
  ❱  41 │ res: Response = await call_next(req)
     42 │ duration = str(round(time.time() - ts, 4))
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:84 in call_next
  ... 33 frames hidden ...
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\modules\normalization.py:273 in forward
    272 │ def forward(self, input: Tensor) -> Tensor:
  ❱ 273 │ return F.group_norm(
    274 │     input, self.num_groups, self.weight, self.bias, self.eps)
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\functional.py:2530 in group_norm
   2529 │ _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(
  ❱ 2530 │ return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.e
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 6.17 GiB already allocated; 691.00 MiB free; 6.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:57<00:00, 1.43s/it]
100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:11<00:00, 3.51it/s]
  2%|█▋                                                                                | 2/100 [00:01<01:34, 1.03it/s]
12:07:13-653882 ERROR    API error: POST: http://127.0.0.1:7860/internal/progress {'error': 'OutOfMemoryError', 'detail': '', 'body': '', 'errors': 'CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 5.70 GiB already allocated; 1.13 GiB free; 5.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF'}
12:07:13-656486 ERROR    HTTP API: OutOfMemoryError
Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:98 in receive
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:93 in receive_nowait
WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:78 in call_next
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:118 in receive
EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\modules\middleware.py:41 in log_and_time
     40 │ ts = time.time()
  ❱  41 │ res: Response = await call_next(req)
     42 │ duration = str(round(time.time() - ts, 4))
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:84 in call_next
  ... 33 frames hidden ...
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\modules\normalization.py:273 in forward
    272 │ def forward(self, input: Tensor) -> Tensor:
  ❱ 273 │ return F.group_norm(
    274 │     input, self.num_groups, self.weight, self.bias, self.eps)
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\functional.py:2530 in group_norm
   2529 │ _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(
  ❱ 2530 │ return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.e
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 5.70 GiB already allocated; 1.13 GiB free; 5.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
100%|████████████████████████████████████████████████████████████████████████████████| 100/100 [01:27<00:00, 1.14it/s]
12:08:39-171451 WARNING  GPU high memory utilization: 97% {'ram': {'used': 3.89, 'total': 31.88}, 'gpu': {'used': 7.73, 'total': 8.0}, 'retries': 19, 'oom': 2}
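
The allocator hint embedded in that OOM message is actionable on its own: PYTORCH_CUDA_ALLOC_CONF is a real PyTorch environment variable, and max_split_size_mb is one of its knobs. A minimal sketch of applying it before the server starts, assuming the variable is set in the same process that later imports torch (the 512 value here is illustrative, not a recommendation from this thread):

```python
import os

# Must be set before torch initializes CUDA; 512 is an illustrative value,
# and tuning it is workload-dependent.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch  # imported after setting the allocator config on purpose

# Subsequent CUDA allocations use the capped split size, which can reduce
# fragmentation-related OOMs on 8 GiB cards.
if torch.cuda.is_available():
    x = torch.zeros(1024, 1024, device="cuda")
    print(torch.cuda.memory_allocated())
```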

brknsoul commented 1 year ago

Please select the entire error message and use the add code button < > in the editor

Rojinski commented 1 year ago

Please select the entire error message and use the add code button < > in the editor

And while it's generating the hi-res fix, the "Generate" button goes back to its idle state and no picture is shown.... So, I have to watch the console, and only there can I see that it's still working.

```
Using VENV: D:\Programmes 2\Logiciels\Vlad\automatic\venv
12:02:32-379872 INFO     Starting SD.Next
12:02:32-387050 INFO     Python 3.10.10 on Windows
12:02:32-453338 INFO     Version: 75a8c1f9 Mon Jul 10 11:44:52 2023 -0400
12:02:32-933811 INFO     Latest published version: a844a83d9daa9987295932c0db391ec7be5f2d32 2023-07-11T08:00:45Z
12:02:32-937777 INFO     nVidia CUDA toolkit detected
12:02:37-915081 INFO     Torch 2.0.1+cu118
12:02:37-988768 INFO     Torch backend: nVidia CUDA 11.8 cuDNN 8700
12:02:37-989775 INFO     Torch detected GPU: NVIDIA GeForce RTX 3070 Ti VRAM 8192 Arch (8, 6) Cores 48
12:02:38-196644 INFO     Enabled extensions-builtin: ['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora', 'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-dynamic-thresholding', 'sd-extension-aesthetic-scorer', 'sd-extension-steps-animation', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sd-webui-model-converter', 'seed_travel', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg', 'SwinIR']
12:02:38-216999 INFO     Enabled extensions: ['adetailer', 'Config-Presets', 'model-keyword', 'openpose-editor', 'SD-CN-Animation', 'sd-dynamic-prompts', 'sd-webui-3d-open-pose-editor', 'sd-webui-additional-networks', 'sd-webui-aspect-ratio-helper', 'sd-webui-roop', 'sd_dreambooth_extension', 'ultimate-upscale-for-automatic1111']
12:02:38-255389 INFO     Verifying requirements
12:02:38-267422 WARNING  Package wrong version: accelerate 0.19.0 required 0.20.3
12:02:38-268392 INFO     Installing package: accelerate==0.20.3
12:02:42-003968 WARNING  Package wrong version: diffusers 0.16.1 required 0.18.1
12:02:42-005259 INFO     Installing package: diffusers==0.18.1
12:02:45-572903 INFO     Verifying packages
12:02:45-573902 INFO     Verifying repositories
12:02:48-882539 INFO     Verifying submodules
12:03:23-306720 INFO     Extension installed packages: sd_dreambooth_extension ['accelerate==0.19.0', 'diffusers==0.16.1']
12:03:23-401990 INFO     Extensions enabled: ['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora', 'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-dynamic-thresholding', 'sd-extension-aesthetic-scorer', 'sd-extension-steps-animation', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sd-webui-model-converter', 'seed_travel', 'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg', 'SwinIR', 'adetailer', 'Config-Presets', 'model-keyword', 'openpose-editor', 'SD-CN-Animation', 'sd-dynamic-prompts', 'sd-webui-3d-open-pose-editor', 'sd-webui-additional-networks', 'sd-webui-aspect-ratio-helper', 'sd-webui-roop', 'sd_dreambooth_extension', 'ultimate-upscale-for-automatic1111']
12:03:23-403986 INFO     Verifying packages
12:03:23-412330 INFO     Extension preload: 0.0s D:\Programmes 2\Logiciels\Vlad\automatic\extensions-builtin
12:03:23-415329 INFO     Extension preload: 0.0s D:\Programmes 2\Logiciels\Vlad\automatic\extensions
12:03:23-424338 INFO     Server arguments: []
12:03:27-474319 INFO     Pipeline: Backend.ORIGINAL
12:03:29-167450 INFO     Libraries loaded
12:03:29-168456 INFO     Using data path: D:\Programmes 2\Logiciels\Vlad\automatic
12:03:29-188561 INFO     Available VAEs: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE 16
12:03:32-445171 INFO     Available models: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion 305
12:03:34-822534 INFO     ControlNet v1.1.232
ControlNet preprocessor location: D:\Programmes 2\Logiciels\Vlad\automatic\extensions-builtin\sd-webui-controlnet\annotator\downloads
12:03:35-376103 INFO     ControlNet v1.1.232
Image Browser: ImageReward is not installed, cannot be used.
[-] ADetailer initialized. version: 23.7.5, num models: 9
12:03:37-978561 INFO     Libraries loaded
[AddNet] Updating model hashes... 0it [00:00, ?it/s]
2023-07-11 12:03:39,293 - roop - INFO - roop v0.0.2
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion\beautifulArt_v30.safetensors
12:03:39-834676 INFO     Setting Torch parameters: dtype=torch.float16 vae=torch.float16 unet=torch.float16
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
12:03:48-681918 INFO     Applying xformers cross attention optimization
12:03:52-227454 INFO     Embeddings: loaded=249 skipped=29
12:03:52-233485 INFO     Model loaded in 12.8s (load=0.3s config=0.1s create=0.4s apply=6.5s vae=1.6s move=0.4s embeddings=3.5s)
12:03:52-382018 INFO     Model load finished: {'ram': {'used': 3.22, 'total': 31.88}, 'gpu': {'used': 3.13, 'total': 8.0}, 'retries': 0, 'oom': 0} cached=0
12:03:53-018226 INFO     Loading UI theme: name=gradio/default style=Auto
12:03:54-644963 INFO     Available models: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion 305
CUDA SETUP: Loading binary D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Running on local URL: http://127.0.0.1:7860
12:03:57-656821 INFO     Local URL: http://127.0.0.1:7860/
12:03:57-659604 INFO     Initializing middleware
12:03:57-803068 INFO     [AgentScheduler] Task queue is empty
12:03:57-803932 INFO     [AgentScheduler] Registering APIs
12:03:58-337040 INFO     Startup time: 34.9s (torch=2.8s gradio=0.9s libraries=2.1s models=3.3s codeformer=0.1s scripts=20.3s onchange=0.1s ui-txt2img=0.8s ui-img2img=0.1s ui-settings=0.1s ui-extensions=3.2s ui-defaults=0.1s launch=0.3s app-started=0.3s checkpoint=0.4s)
100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:10<00:00, 3.70it/s]
 40%|████████████████████████████████▊                                                 | 16/40 [00:24<00:50, 2.11s/it]
12:05:35-699059 ERROR    API error: POST: http://127.0.0.1:7860/internal/progress {'error': 'OutOfMemoryError', 'detail': '', 'body': '', 'errors': 'CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 6.17 GiB already allocated; 691.00 MiB free; 6.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF'}
12:05:35-700706 ERROR    HTTP API: OutOfMemoryError
Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:98 in receive
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:93 in receive_nowait
WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:78 in call_next
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:118 in receive
EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\modules\middleware.py:41 in log_and_time
     40 │ ts = time.time()
  ❱  41 │ res: Response = await call_next(req)
     42 │ duration = str(round(time.time() - ts, 4))
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:84 in call_next
  ... 33 frames hidden ...
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\modules\normalization.py:273 in forward
    272 │ def forward(self, input: Tensor) -> Tensor:
  ❱ 273 │ return F.group_norm(
    274 │     input, self.num_groups, self.weight, self.bias, self.eps)
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\functional.py:2530 in group_norm
   2529 │ _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(
  ❱ 2530 │ return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.e
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 6.17 GiB already allocated; 691.00 MiB free; 6.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:57<00:00, 1.43s/it]
100%|██████████████████████████████████████████████████████████████████████████████████| 40/40 [00:11<00:00, 3.51it/s]
  2%|█▋                                                                                | 2/100 [00:01<01:34, 1.03it/s]
12:07:13-653882 ERROR    API error: POST: http://127.0.0.1:7860/internal/progress {'error': 'OutOfMemoryError', 'detail': '', 'body': '', 'errors': 'CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 5.70 GiB already allocated; 1.13 GiB free; 5.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF'}
12:07:13-656486 ERROR    HTTP API: OutOfMemoryError
Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:98 in receive
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:93 in receive_nowait
WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:78 in call_next
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:118 in receive
EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\modules\middleware.py:41 in log_and_time
     40 │ ts = time.time()
  ❱  41 │ res: Response = await call_next(req)
     42 │ duration = str(round(time.time() - ts, 4))
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:84 in call_next
  ... 33 frames hidden ...
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\modules\normalization.py:273 in forward
    272 │ def forward(self, input: Tensor) -> Tensor:
  ❱ 273 │ return F.group_norm(
    274 │     input, self.num_groups, self.weight, self.bias, self.eps)
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\functional.py:2530 in group_norm
   2529 │ _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(
  ❱ 2530 │ return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.e
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 5.70 GiB already allocated; 1.13 GiB free; 5.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
100%|████████████████████████████████████████████████████████████████████████████████| 100/100 [01:27<00:00, 1.14it/s]
12:08:39-171451 WARNING  GPU high memory utilization: 97% {'ram': {'used': 3.89, 'total': 31.88}, 'gpu': {'used': 7.73, 'total': 8.0}, 'retries': 19, 'oom': 2}

Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Lora\Roji_MiaM3.safetensors ━━━━━━━━ 0.0/9.6 MB -:--:--
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Lora\epiNoiseoffset_v2.safetensors ━━━━━ 0.0/8… MB -:--:…
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Lora\add_detail.safetensors ━━━━━━━━━ 0.0/37.9 MB -:--:--
100%|█████████████████████████████████████████████████████████████████████████████████████████| 70/70 [00:46<00:00, 1.52it/s]
12:17:18-223274 WARNING  GPU high memory utilization: 100% {'ram': {'used': 4.07, 'total': 31.88}, 'gpu': {'used': 8.0, 'total': 8.0}, 'retries': 20, 'oom': 2}
  1%|▉                                                                                 | 1/100 [00:03<05:27, 3.31s/it]
12:17:24-424713 ERROR    API error: POST: http://127.0.0.1:7860/internal/progress {'error': 'OutOfMemoryError', 'detail': '', 'body': '', 'errors': 'CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 5.58 GiB already allocated; 1.24 GiB free; 5.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF'}
12:17:24-426710 ERROR    HTTP API: OutOfMemoryError
Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:98 in receive
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:93 in receive_nowait
WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:78 in call_next
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\streams\memory.py:118 in receive
EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  D:\Programmes 2\Logiciels\Vlad\automatic\modules\middleware.py:41 in log_and_time
     40 │ ts = time.time()
  ❱  41 │ res: Response = await call_next(req)
     42 │ duration = str(round(time.time() - ts, 4))
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\starlette\middleware\base.py:84 in call_next
  ... 33 frames hidden ...
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\modules\normalization.py:273 in forward
    272 │ def forward(self, input: Tensor) -> Tensor:
  ❱ 273 │ return F.group_norm(
    274 │     input, self.num_groups, self.weight, self.bias, self.eps)
  D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\functional.py:2530 in group_norm
   2529 │ _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(
  ❱ 2530 │ return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.e
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 8.00 GiB total capacity; 5.58 GiB already allocated; 1.24 GiB free; 5.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
100%|███████████████████████████████████████████████████████████████████████████████████████| 100/100 [03:03<00:00, 1.83s/it]
```

brknsoul commented 1 year ago

This means you're trying to generate an image that's too large for your GPU's RAM. Stick to 512x512 or 512x768 (or vice versa), OR your Hires Fix scaling is too high.
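
For scale: a 2x Hires Fix pass on a 768x768 base works on a 1536x1536 image, four times the pixels of the base pass. A rough back-of-the-envelope sketch (the latent layout is standard SD 1.x: 1/8 resolution, 4 channels; activation memory only loosely tracks pixel count, so treat this as an estimate):

```python
# Pixel and latent growth for a 2x Hires Fix pass on a 768x768 base image.
base_w = base_h = 768
scale = 2
hires_w, hires_h = base_w * scale, base_h * scale  # 1536x1536

print(f"base pixels:  {base_w * base_h:,}")    # 589,824
print(f"hires pixels: {hires_w * hires_h:,}")  # 2,359,296 -> 4x the work

# SD 1.x latents are 1/8 resolution with 4 channels; at fp16 (2 bytes) the
# latent itself stays small -- it is the attention maps and intermediate
# activations, which grow with pixel count, that exhaust an 8 GiB card.
latent_bytes = (hires_w // 8) * (hires_h // 8) * 4 * 2
print(f"hires latent: {latent_bytes / 1024:.0f} KiB")
```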

Rojinski commented 1 year ago

Once again, thanks for your help. And I can assure you that in Automatic 1111 it was working (yes, a bit slower than usual, indeed) without problems with the latest drivers... :/

But until now, I never had this problem with the same generation size. So, it's strange that now I have to reduce the quality... It happened after the last update of Vlad.

brknsoul commented 1 year ago

What COMMANDLINE_ARGS were you using with A1111?

Rojinski commented 1 year ago

--xformers, same in Vlad (selected in the settings). Here's the A1111 console.

locon load lora method
locon load lora method
locon load lora method
100%|██████████████████████████████████████████████████████████████████████████████████| 70/70 [00:32<00:00, 2.14it/s]
 63%|███████████████████████████████████████████████████                               | 63/100 [02:52<01:43, 2.80s/it]
Total progress:  78%|██████████████████████████████████████████████████               | 133/170 [03:40<01:43, 2.80s/it]

Rojinski commented 1 year ago

Here's the complete console log

```
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 9 (delta 2), reused 6 (delta 2), pack-reused 0
Unpacking objects: 100% (9/9), 16.08 KiB | 658.00 KiB/s, done.
From https://github.com/AUTOMATIC1111/stable-diffusion-webui
   8d0078b6..f865d3e1  master            -> origin/master
   910d4f61..7b833291  dev               -> origin/dev
   dbc88c96..f865d3e1  release_candidate -> origin/release_candidate

Checking roop requirements
Install insightface==0.7.3
Installing sd-webui-roop requirement: insightface==0.7.3
Install onnx==1.14.0
Installing sd-webui-roop requirement: onnx==1.14.0
Install onnxruntime==1.15.0
Installing sd-webui-roop requirement: onnxruntime==1.15.0
Install opencv-python==4.7.0.72
Installing sd-webui-roop requirement: opencv-python==4.7.0.72

Launching Web UI with arguments: --disable-safe-unpickle --xformers
Additional Network extension not installed, Only hijack built-in lora
LoCon Extension hijack built-in lora successfully
[-] ADetailer initialized. version: 23.7.5, num models: 9
[AddNet] Updating model hashes... 0it [00:00, ?it/s]
2023-07-11 12:31:24,570 - ControlNet - INFO - ControlNet v1.1.231
ControlNet preprocessor location: D:\Programmes 2\Logiciels\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\downloads
2023-07-11 12:31:24,745 - ControlNet - INFO - ControlNet v1.1.231
2023-07-11 12:31:25,759 - roop - INFO - roop v0.0.2
Loading weights [d520ddee8f] from D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion\beautifulArt_v30.safetensors
Creating model from config: D:\Programmes 2\Logiciels\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
SadTalker will not support download all the files from hugging face, which will take a long time.
        please manually set the SADTALKER_CHECKPOINTS in `webui_user.bat`(windows) or `webui_user.sh`(linux)
Loading VAE weights specified in settings: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Startup time: 23.6s (import torch: 3.6s, import gradio: 2.0s, import ldm: 0.9s, other imports: 2.2s, setup codeformer: 0.1s, list SD models: 0.8s, load scripts: 10.9s, scripts before_ui_callback: 0.3s, create ui: 2.5s, gradio launch: 0.2s).
preload_extensions_git_metadata for 26 extensions took 1.74s
Applying attention optimization: xformers... done.
Model loaded in 5.4s (load weights from disk: 0.5s, create model: 0.8s, apply weights to model: 0.5s, apply half(): 0.5s, load VAE: 0.1s, move model to device: 2.7s, load textual inversion embeddings: 0.2s, calculate empty prompt: 0.1s).
locon load lora method
locon load lora method
locon load lora method
locon load lora method
100%|██████████████████████████████████████████████████████████████████████████████████| 70/70 [00:32<00:00, 2.14it/s]
100%|████████████████████████████████████████████████████████████████████████████████| 100/100 [04:32<00:00, 2.72s/it]
Total progress: 100%|████████████████████████████████████████████████████████████████| 170/170 [05:21<00:00, 2.82s/it]
```

brknsoul commented 1 year ago

With a 3070 GPU, you should not be using xformers, but the next one across (Scaled Dot Product, iirc).

In the future, for error messages, place three backticks at the start and at the end of the error message, like this:

```
ERROR MESSAGES
```

So they appear like this:

```
ERROR
MESSAGES
```
Rojinski commented 1 year ago

OK. Thank you once again. :) I've always used xformers and I never had problems.... So, why now?

brknsoul commented 1 year ago

xformers is faster for lower-end nvidia GPUs; SDP is faster for higher-end GPUs.

Updates get pushed quite a bit for SD.Next. Vlad tries to keep this on the bleeding edge of what's available for Stable Diffusion, so occasionally hiccups happen.

As for your Live Preview issue, you could adjust the Live Preview Method in Settings > Live Preview to Approximate NN or Approximate simple, although disabling it does generate images slightly faster.
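
For reference, the SDP option maps onto PyTorch 2.x's built-in fused attention, torch.nn.functional.scaled_dot_product_attention, the same primitive that appears later in this thread's traceback (sd_hijack_optimizations.py). A minimal sketch with illustrative shapes:

```python
import torch
import torch.nn.functional as F

# Illustrative attention shapes: (batch, heads, tokens, head_dim).
q = torch.randn(1, 8, 4096, 64)
k = torch.randn(1, 8, 4096, 64)
v = torch.randn(1, 8, 4096, 64)

# PyTorch selects a fused backend (flash / memory-efficient / math) on its own;
# on CUDA this is what stands in for the xformers attention kernel.
out = F.scaled_dot_product_attention(q, k, v, dropout_p=0.0, is_causal=False)
print(out.shape)  # torch.Size([1, 8, 4096, 64])
```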

Rojinski commented 1 year ago

OK, thanks a lot for the information!

Rojinski commented 1 year ago

With a 3070 GPU, you should not be using xformers, but the next one across (Scaled Dot Product, iirc).

In the future, for error messages, place three backticks at the start and at the end of the error message, like this: ERROR MESSAGES

So they appear like this

ERROR
MESSAGES

I've tried to swap from xformers to SDP... and nothing is working anymore.... :(

Using VENV: D:\Programmes 2\Logiciels\Vlad\automatic\venv
12:52:01-473106 INFO     Starting SD.Next
12:52:01-479106 INFO     Python 3.10.10 on Windows
12:52:01-514630 INFO     Version: 75a8c1f9 Mon Jul 10 11:44:52 2023 -0400
12:52:01-918023 INFO     Latest published version: a844a83d9daa9987295932c0db391ec7be5f2d32 2023-07-11T08:00:45Z
12:52:01-924007 INFO     nVidia CUDA toolkit detected
12:52:03-046623 INFO     Torch 2.0.1+cu118
12:52:03-055656 INFO     Torch backend: nVidia CUDA 11.8 cuDNN 8700
12:52:03-057295 INFO     Torch detected GPU: NVIDIA GeForce RTX 3070 Ti VRAM 8192 Arch (8, 6) Cores 48
12:52:03-058265 WARNING  Not used, uninstalling: xformers 0.0.20
12:52:03-830438 INFO     Enabled extensions-builtin: ['a1111-sd-webui-lycoris', 'clip-interrogator-ext', 'LDSR', 'Lora',
                         'multidiffusion-upscaler-for-automatic1111', 'ScuNET', 'sd-dynamic-thresholding',
                         'sd-extension-aesthetic-scorer', 'sd-extension-steps-animation', 'sd-extension-system-info',
                         'sd-webui-agent-scheduler', 'sd-webui-controlnet', 'sd-webui-model-converter', 'seed_travel',
                         'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg', 'SwinIR']
12:52:03-835464 INFO     Enabled extensions: ['adetailer', 'Config-Presets', 'model-keyword', 'openpose-editor',
                         'SD-CN-Animation', 'sd-dynamic-prompts', 'sd-webui-3d-open-pose-editor',
                         'sd-webui-additional-networks', 'sd-webui-aspect-ratio-helper', 'sd-webui-roop',
                         'sd_dreambooth_extension', 'ultimate-upscale-for-automatic1111']
12:52:03-839430 INFO     No changes detected: Quick launch active
12:52:03-846899 INFO     Extension preload: 0.0s D:\Programmes 2\Logiciels\Vlad\automatic\extensions-builtin
12:52:03-848897 INFO     Extension preload: 0.0s D:\Programmes 2\Logiciels\Vlad\automatic\extensions
12:52:03-858898 INFO     Server arguments: []
12:52:07-639542 INFO     Pipeline: Backend.ORIGINAL
No module 'xformers'. Proceeding without it.
12:52:08-514862 INFO     Libraries loaded
12:52:08-516869 INFO     Using data path: D:\Programmes 2\Logiciels\Vlad\automatic
12:52:08-524901 INFO     Available VAEs: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE 16
12:52:08-652941 INFO     Available models: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion 305
12:52:10-611215 INFO     ControlNet v1.1.232
ControlNet v1.1.232
ControlNet preprocessor location: D:\Programmes 2\Logiciels\Vlad\automatic\extensions-builtin\sd-webui-controlnet\annotator\downloads
12:52:10-743719 INFO     ControlNet v1.1.232
ControlNet v1.1.232
Image Browser: ImageReward is not installed, cannot be used.
[-] ADetailer initialized. version: 23.7.5, num models: 9
12:52:12-667746 INFO     Libraries loaded
Libraries loaded
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
2023-07-11 12:52:13,167 - roop - INFO - roop v0.0.2
2023-07-11 12:52:13,167 - roop - INFO - roop v0.0.2
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion\beautifulArt_v30.safetensors
12:52:13-461875 INFO     Setting Torch parameters: dtype=torch.float16 vae=torch.float16 unet=torch.float16
Setting Torch parameters: dtype=torch.float16 vae=torch.float16 unet=torch.float16
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\VAE\vae-ft-mse-840000-ema-pruned.safetensors
12:52:15-117038 INFO     Applying scaled dot product cross attention optimization
Applying scaled dot product cross attention optimization
12:52:15-276314 INFO     Embeddings: loaded=249 skipped=29
Embeddings: loaded=249 skipped=29
12:52:15-281551 INFO     Model loaded in 2.0s (load=0.1s create=0.3s apply=0.4s vae=0.5s move=0.5s embeddings=0.2s)
Model loaded in 2.0s (load=0.1s create=0.3s apply=0.4s vae=0.5s move=0.5s embeddings=0.2s)
12:52:15-420500 INFO     Model load finished: {'ram': {'used': 3.22, 'total': 31.88}, 'gpu': {'used': 3.13, 'total':
                         8.0}, 'retries': 0, 'oom': 0} cached=0
Model load finished: {'ram': {'used': 3.22, 'total': 31.88}, 'gpu': {'used': 3.13, 'total': 8.0}, 'retries': 0, 'oom': 0} cached=0
12:52:16-028433 INFO     Loading UI theme: name=gradio/default style=Auto
12:52:17-548383 INFO     Available models: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Stable-diffusion 305
CUDA SETUP: Loading binary D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cudaall.dll...
Running on local URL:  http://127.0.0.1:7860
12:52:20-196633 INFO     Local URL: http://127.0.0.1:7860/
12:52:20-197638 INFO     Initializing middleware
12:52:20-346488 INFO     [AgentScheduler] Task queue is empty
12:52:20-347011 INFO     [AgentScheduler] Registering APIs
12:52:20-874343 INFO     Startup time: 17.0s (torch=2.7s gradio=0.8s libraries=1.2s models=0.1s scripts=7.2s
                         onchange=0.1s ui-txt2img=0.7s ui-img2img=0.1s ui-settings=0.1s ui-extensions=2.9s
                         ui-defaults=0.1s launch=0.3s app-started=0.3s checkpoint=0.4s)
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Lora\epiNoiseoffset_v2-pynoise.safetensors   -…
Loading weights: D:\Programmes 2\Logiciels\stable-diffusion-webui\models\Lora\add_detail.safetensors ━━━━━ 0.0/3… -:--:…
                                                                                                           MB
100%|██████████████████████████████████████████████████████████████████████████████████| 50/50 [00:32<00:00,  1.55it/s]
12:54:10-878543 ERROR    Exception: CUDA out of memory. Tried to allocate 2.53 GiB (GPU 0; 8.00 GiB total capacity; 5.24
                         GiB already allocated; 1.51 GiB free; 5.36 GiB reserved in total by PyTorch) If reserved memory
                         is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation
                         for Memory Management and PYTORCH_CUDA_ALLOC_CONF
12:54:10-880534 ERROR    Arguments: args=('task(zoufdtjx09rttst)', '<lora:epiNoiseoffset_v2-pynoise:1>, masterpiece,
                         best quality, dark art, sinister old woman naked, artistic vision of peace in death, night,
                         absurdes, intricate, surrealism, dramatic lighting, epitaph, tomb, art by Agostino Arrivabene,
                         by Alois Arnegger, by Bastien Lecouffe-Deharme, by Karol Bak, by Beksinski, by Agnes Cecile,
                         8k, uhd  <lora:add_detail:1>', 'bad-image-v2-39000, EasyNegative, NG_DeepNegative_V1_75T,
                         bad-hands-5,   badPromptVersion2_v10, NG_DeepNegative_V1_75T, rmadanegative4_sd15-neg, red
                         eyes, orange eyes, yellow eyes,  photozoov15,  SkinDetailNeg-neg,   difConsistency_negative',
                         [], 50, 10, True, False, 1, 1, 7, 1, 483393612.0, -1.0, 0, 0, 0, False, 768, 768, True, 0.01,
                         2, '4x-UltraSharp', 100, 0, 0, [], 0, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96,
                         48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2,
                         '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0,
                         False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '',
                         'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False,
                         0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '',
                         'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False,
                         1536, 96, True, True, True, False, False, 7, 100, 'Constant', 0, 'Constant', 0, 4, False,
                         'x264', 'blend', 10, 0, 0, False, True, True, True, 'intermediate', 'animation',
                         <scripts.controlnet_ui.controlnet_ui_group.UiControlNetUnit object at 0x0000014A05B99300>,
                         False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '',
                         'ad_confidence': 0.3, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0,
                         'ad_y_offset': 0, 'ad_dilate_erode': 32, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4,
                         'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding':
                         32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512,
                         'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7,
                         'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_restore_face': False,
                         'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious',
                         'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1,
                         'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence':
                         0.3, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0,
                         'ad_dilate_erode': 32, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4,
                         'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding':
                         32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512,
                         'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7,
                         'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_restore_face': False,
                         'ad_controlnet_model': 'None', 'ad_controlnet_module': 'inpaint_global_harmonious',
                         'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1,
                         'is_api': ()}, False, 'keyword, prompt', 'keyword1', 'None', 'model keyword first', 'None',
                         '1', 'None', True, False, 1, False, False, False, 1.1, 1.5, 100, 0.7, False, False, True,
                         False, False, 0, 'Gustavosta/MagicPrompt-Stable-Diffusion', '', False, False, 'LoRA', 'None',
                         0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0,
                         None, 'Refresh models', None, False, '0', 'D:\\Programmes
                         2\\Logiciels\\Vlad\\automatic\\models\\roop\\inswapper_128.onnx', 'CodeFormer', 1, '', 1, 1,
                         False, True, False, False, 'positive', 'comma', 0, False, False, '', 7, '', [], 0, '', [], 0,
                         '', [], True, False, False, False, 0, False, None, None, False, 50, False, 4.0, '', 10.0,
                         'Linear', 3, False, 30.0, True, False, False, 0, 0.0, 'Lanczos', 1, True, 0, 0, 0.001, 75, 0.0,
                         False, True) kwargs={}
12:54:10-889575 ERROR    gradio call: OutOfMemoryError
╭───────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────╮
│ D:\Programmes 2\Logiciels\Vlad\automatic\modules\call_queue.py:34 in f                                               │
│                                                                                                                      │
│    33 │   │   │   try:                                                                                               │
│ ❱  34 │   │   │   │   res = func(*args, **kwargs)                                                                    │
│    35 │   │   │   │   progress.record_results(id_task, res)                                                          │
│                                                                                                                      │
│ D:\Programmes 2\Logiciels\Vlad\automatic\modules\txt2img.py:56 in txt2img                                            │
│                                                                                                                      │
│   55 │   if processed is None:                                                                                       │
│ ❱ 56 │   │   processed = processing.process_images(p)                                                                │
│   57 │   p.close()                                                                                                   │
│                                                                                                                      │
│                                               ... 11 frames hidden ...                                               │
│                                                                                                                      │
│ D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1501 in _call_impl        │
│                                                                                                                      │
│   1500 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                       │
│ ❱ 1501 │   │   │   return forward_call(*args, **kwargs)                                                              │
│   1502 │   │   # Do not call functions when jit is used                                                              │
│                                                                                                                      │
│ D:\Programmes 2\Logiciels\Vlad\automatic\modules\sd_hijack_optimizations.py:515 in sdp_attnblock_forward             │
│                                                                                                                      │
│   514 │   v = v.contiguous()                                                                                         │
│ ❱ 515 │   out = torch.nn.functional.scaled_dot_product_attention(q, k, v, dropout_p=0.0, is_ca                       │
│   516 │   out = out.to(dtype)                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.53 GiB (GPU 0; 8.00 GiB total capacity; 5.24 GiB already
allocated; 1.51 GiB free; 5.36 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting
max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
12:55:00-321795 INFO     Settings changed: 1 ['cross_attention_optimization']
12:55:09-445454 INFO     Server shutdown requested
Traceback (most recent call last):
  File "D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict
    output = await app.get_blocks().process_api(
  File "D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "D:\Programmes 2\Logiciels\Vlad\automatic\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
asyncio.exceptions.CancelledError

brknsoul commented 1 year ago

I guess you could go back to using xformers if it helps.

I'm not well versed on nvidia GPUs.

Rather than chatting back and forth here, I suggest you use the SD.Next Discord.

https://github.com/vladmandic/automatic/discussions/1059

vladmandic commented 1 year ago

you said problems started after you upgraded - from which version? there have been no changes in the core workflow in quite a while that would impact this.

and yes, you can use either sdp or xformers, whichever works better for you. one thing to try, to eliminate variables, is to temporarily disable some user extensions, as they are known to change required packages to older versions.
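
The startup logs above show that package tug-of-war concretely: sd_dreambooth_extension reinstalls accelerate==0.19.0 and diffusers==0.16.1, which SD.Next then upgrades back to 0.20.3 and 0.18.1 on every launch. A quick way to check what actually ended up in the venv, a sketch using only the standard library:

```python
from importlib.metadata import version, PackageNotFoundError

# Packages the startup log shows being pulled back and forth between
# SD.Next's requirements and sd_dreambooth_extension's pins.
for pkg in ("accelerate", "diffusers", "torch", "xformers"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```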

regarding the nvidia driver, there is a long thread on that and nvidia itself admitted to a problem with newer versions and memory handling, so it's not just sdnext.

and did you try with --medvram?

Rojinski commented 1 year ago

you said problems started after you upgraded - from which version? there have been no changes in the core workflow in quite a while that would impact this.

and yes, you can use either sdp or xformers, whichever works better for you. one thing to try, to eliminate variables, is to temporarily disable some user extensions, as they are known to change required packages to older versions.

regarding the nvidia driver, there is a long thread on that and nvidia itself admitted to a problem with newer versions and memory handling, so it's not just sdnext.

and did you try with --medvram?

Hello :) thanks a lot for your help.

I've updated SD.Next from the version just before this one (I keep it up-to-date with --upgrade or "git pull" in the cmd once every two days).

I've reinstalled the latest nvidia driver because with v531 it was a mess... and I got a bunch of errors on sdnext and even on Automatic 1111 (CUDA out of memory). So, I've reinstalled it all and it works fine on Automatic 1111, I promise.

I've never used --medvram before and, once again, it was working fine like that on sdnext.

I forgot to mention one thing: after upgrading sdnext, I tried Sebastian Kamph's tutorial about using sdnext for SDXL with --backend diffusers. It was not working for me; I couldn't load the models from the UI as indicated.

Until then, sdnext was perfect using xformers, no --medvram.... It was even a bit faster than A1111.

The only "strange" extension I have is SD-CN. And it's working well and fast (1 hour for 720 frames at 1024x576)... Maybe i should make a clean install of sdnext again? Anyway, thanks a lot for all your work and sdnext is brilliant.

vladmandic commented 1 year ago

I've reinstalled the latest nvidia driver because with v531 it was a mess... and I got a bunch of errors on sdnext and even on Automatic 1111 (CUDA out of memory). So, I've reinstalled it all and it works fine on Automatic 1111, I promise.

This statement is misleading at best - you need to know what exactly the issue with the nvidia drivers is. Read the thread on that. If 531 is reporting OOM, then it's a real OOM, it's not "a mess". Newer drivers start swapping VRAM to RAM, causing massive slowdowns. So you either live with the slowdown or live with the OOM. And this is a very short version of the story, but I cannot go over the details of something that was covered in 50+ posts. And nVidia acknowledged the issue.

it works fine on Automatic 1111, I promise

Tiny differences matter for when swapping starts, and if you set up SDNext exactly the same as A1111, it would behave exactly the same. But the defaults for a lot of settings are different, so the slowdown is triggered at different levels.

I'm sorry, but that's the reality.

And there were NO changes in the last 10 days that would have any impact on this. But some packages got upgraded, which probably means they use 1-2 MB more, and that's the trigger. You're basically running at the edge of the maximum resolution, and this is expected behavior.

Rojinski commented 1 year ago

Thank you for your answer, and I'm sorry if I've upset you with my "newbie" question.... It was not my purpose to criticize sdnext (or you!!) in a bad way. I'm just a sound engineer :) I understand that there's an issue with the Nvidia driver and nobody can do anything about that right now. So, once again, I am sorry for asking questions that are silly to you and your peers. Keep up the good work. And thanks for your kindness.