Closed shimizu-izumi closed 1 year ago
When did you last update the webui? This may be from a Windows update. You may want to disable browser hardware acceleration. I've found that the openoutpaint extension automatically uses some VRAM when browser hardware acceleration is enabled.
Same issue here. For a simple 5.x5 I can't even use the normal SD 2.1 model or any upscale. That happened with the new update today. :/
I updated the WebUI around 2 PM UTC+1. The last major Windows update was a few weeks ago. When I used the WebUI a few days ago, everything still worked without any errors, and I don't have the openoutpaint extension.
I made a fresh install right now with a RTX4090. Running out of VRAM constantly, never happened before.
RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 23.99 GiB total capacity; 12.81 GiB already allocated; 0 bytes free; 21.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
Denoising strength: 0.69, Clip skip: 2, Hires upscale: 2, Hires upscaler: R-ESRGAN AnimeVideo
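The error text above already contains the numbers needed to judge whether fragmentation is plausible. Here is a minimal sketch that pulls them out of the message. It assumes the message follows the wording quoted in this thread; `summarize_oom` and the `cached_unused` field are hypothetical names for illustration, not part of PyTorch or the webui.

```python
import re

# Matches the OOM wording quoted in this thread; other PyTorch versions may phrase it differently.
OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<request>[\d.]+) GiB.*?"
    r"(?P<total>[\d.]+) GiB total capacity; "
    r"(?P<allocated>[\d.]+) GiB already allocated; "
    r".*?(?P<reserved>[\d.]+) GiB reserved in total"
)


def summarize_oom(message):
    """Return the GiB figures from a CUDA OOM message, or None if it does not match."""
    match = OOM_PATTERN.search(message)
    if match is None:
        return None
    values = {name: float(text) for name, text in match.groupdict().items()}
    # Memory PyTorch holds in its cache but is not currently using for tensors.
    values["cached_unused"] = round(values["reserved"] - values["allocated"], 2)
    return values


message = (
    "RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB "
    "(GPU 0; 23.99 GiB total capacity; 12.81 GiB already allocated; "
    "0 bytes free; 21.74 GiB reserved in total by PyTorch)"
)
print(summarize_oom(message))
```

For the 4090 report, reserved memory exceeds allocated memory by several GiB, which is the situation the message itself associates with fragmentation.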
I might be mistaken, but I think the culprit is the new Highres fix. It upscales the images before processing them a second time, and they may be too big to fit into your VRAM. I see a lot of people complaining about how confusing it is to use and how it gives inferior results. In my experience, too, its usability is questionable right now.
If you really need to use the Highres fix now, try setting the upscaling factor to 1. It somehow makes it behave, even though it's counter-intuitive and the default setting is 2. Here are some examples I got: Default settings (upscale by 2): Upscale by 1:
On the other hand, I just noticed that you have a lot of RAM, which makes me think my assumption is completely wrong and something else entirely is going on. I'm going to try your settings with the same model and see what I get on 8 GB.
Here's the result I got: `masterpiece, best quality, 1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name, Steps: 50, Sampler: Euler a, CFG scale: 7, Seed: 3607441108, Size: 512x768, Model: Anything-V3.0-pruned-fp32, Denoising strength: 0.69, Clip skip: 2, Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+ Anime6B
Time taken: 4m 49.25s Torch active/reserved: 4777/6598 MiB, Sys VRAM: 8192/8192 MiB (100.0%)` It used all the available memory, but didn't run out. It also made the image twice the size I asked for, and it took almost 5 minutes on a 1070 Ti.
Commit hash: 24d4a0841d3cc0e5908b098f65a9caa3fa889af8
@Alphyn-gunner It's twice the size because of the hires upscale value.
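To make the size arithmetic concrete, here is a small sketch of how a hires upscale factor changes the output resolution and pixel count. It is plain arithmetic, not the webui's actual code path (which may round dimensions differently); `hires_target` is a hypothetical helper name.

```python
def hires_target(width, height, scale):
    """Return (output width, output height, pixel-count ratio) for an upscale factor."""
    out_w = int(width * scale)
    out_h = int(height * scale)
    # Pixel count grows with the square of the scale factor.
    pixel_ratio = (out_w * out_h) / (width * height)
    return out_w, out_h, pixel_ratio


# Size 512x768 with "Hires upscale: 2", as in the settings posted above.
print(hires_target(512, 768, 2))
# The same size with the upscale factor set to 1.
print(hires_target(512, 768, 1))
```

With a factor of 2 the second pass works on four times as many pixels as the requested size, which is consistent with both the larger output and the higher VRAM use reported here.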
I also noticed that I now get completely different results with the exact same settings.
Could you post the before and after image size limit?
Were you using xformers?
I have the same problem, and I don't even use the hi-res fix! I just do normal gen but the VRAM usage is WAYYYYY higher now! I can't do the same batch size that I used to be able to do previously! Everything else is the same, I changed nothing. I only did a git pull.
I honestly thought I was the only one. Generating images is SOO much slower now (and I have a 4090). I really wish there was a way to revert to the previous update.
Also getting the same problem. I was wondering why hires was taking so long now so I decided to recreate one of my previous images and I got nothing like it with all the same settings and it took forever.
In the latest versions, hires fix has been modified. Does the 5f4fa942b8ec3ed3b15a352903489d6f9e6eb46e version also have these bugs?
For what it's worth I've also noticed this when training an embedding as of updating today via a fresh install. I have an old version which doesn't have any issues which was how the repository was as of 11/5. I have a lower end card (RTX 2060 6G) so embeddings are all I can do for the moment.
Previously I could train a 512/512 embedding and use the "Read parameters" option on the SD1.4 checkpoint. Now the message I get states that an additional 512 MB of VRAM is needed. For experimentation, I lowered the 512 values and the embedding began to train. However, when it tried to generate an image mid-training, the CUDA memory issue occurred again.
It is worth noting that I'm able to use regular prompts as well as the embedding that was terminated early after running out of memory. So this might be helpful in determining what the cause is.
Same here. As suggested, using a less extreme upscale option worked. However, it is still considerably slower. Having different hires fix back ends is nice and might yield better results, but why is this the only option? Why not offer both?
What is the last known commit that doesn't have this change? I think I'll switch back to that for the time being.
The current Hires. fix seems to be tuned much more for higher-end cards. It would be very helpful if there was a way to tune the Hires. fix to the previous settings, either a direct option or an update to the wiki, for cards with 8 GB or less.
For now you could always checkout a previous version:
git checkout fd4461d44c7256d56889f5b5ed9fb660a859172f
This is the one I'm using for the time being as I find the system pretty much unusable as it is now.
Yes, I use xformers. What do you mean by image size limit?
I have the same issue. I found it while using hi-res fix. I completely understand how to use it; that's not the issue. Now I run out of VRAM for the same batch sizes and dimensions as before. @lolxdmainkaisemaanlu pointed out the same, except they are not even using hi-res. I just happened to notice it on hi-res. It seems to be an issue independent of hi-res fix. I'm reverting to fd4461d as well, courtesy of @DrGunnarMallon.
I'm running A1111 on a 2060 Super, so 8GB of VRAM.
I had a bit of a workflow to do a couple of 512x512 low-level passes, and then bumped it up to 768 to start getting in detail, finally finishing off and upscaling to 1024. I've been doing passes of this process for almost a week (I've been making daily "Twelve Days of Christmas" images).
Even on my older card, it works. Now, even going from 512 to 768 with just 50 steps it just wrecks. I currently cannot render anything at 768x768.
I tried resetting to the hash recommended above, but I'm still going OOM. Is there another hash to recommend reverting to prior to that?
Error completing request
Arguments: (0, 'a photograph of a single red apple, on a yellow plate, on a blue checkered tablecloth.', '', 'None', 'None', <PIL.Image.Image image mode=RGBA size=512x512 at 0x1EFB7F20DF0>, None, None, None, None, 0, 50, 0, 4, 0, 1, False, False, 1, 4, 7, 0.2, 1254105237.0, -1.0, 0, 0, 0, False, 768, 768, 0, False, 32, 0, '', '', 0, '<ul>\n<li><code>CFG Scale</code> should be 2 or lower.</li>\n</ul>\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 1, '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, None, None, '', '', '', '', 'Auto rename', {'label': 'Upload avatars config'}, 'Open outputs directory', 'Export to WebUI style', True, {'label': 'Presets'}, {'label': 'QC preview'}, '', [], 'Select', 'QC scan', 'Show pics', None, False, False, False, False, '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 'Positive', 0, ', ', True, 32, 1, '', 0, '', True, False, False) {}
Traceback (most recent call last):
File "G:\GitHub\SDWebUI\modules\call_queue.py", line 45, in f
res = list(func(*args, **kwargs))
File "G:\GitHub\SDWebUI\modules\call_queue.py", line 28, in f
res = func(*args, **kwargs)
File "G:\GitHub\SDWebUI\modules\img2img.py", line 152, in img2img
processed = process_images(p)
File "G:\GitHub\SDWebUI\modules\processing.py", line 471, in process_images
res = process_images_inner(p)
File "G:\GitHub\SDWebUI\modules\processing.py", line 541, in process_images_inner
p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
File "G:\GitHub\SDWebUI\modules\processing.py", line 887, in init
self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
File "G:\GitHub\SDWebUI\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "G:\GitHub\SDWebUI\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 830, in encode_first_stage
return self.first_stage_model.encode(x)
File "G:\GitHub\SDWebUI\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 83, in encode
h = self.encoder(x)
File "G:\GitHub\SDWebUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "G:\GitHub\SDWebUI\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 526, in forward
h = self.down[i_level].block[i_block](hs[-1], temb)
File "G:\GitHub\SDWebUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "G:\GitHub\SDWebUI\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 138, in forward
h = self.norm2(h)
File "G:\GitHub\SDWebUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "G:\GitHub\SDWebUI\venv\lib\site-packages\torch\nn\modules\normalization.py", line 272, in forward
return F.group_norm(
File "G:\GitHub\SDWebUI\venv\lib\site-packages\torch\nn\functional.py", line 2516, in group_norm
return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: CUDA out of memory. Tried to allocate 1.12 GiB (GPU 0; 8.00 GiB total capacity; 5.29 GiB already allocated; 0 bytes free; 6.53 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
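The message above points at `max_split_size_mb` and `PYTORCH_CUDA_ALLOC_CONF`. As a rough sketch of what following that hint could look like from Python: the variable has to be set before PyTorch initializes CUDA, so it goes before the `torch` import. The value 128 is purely an illustrative assumption, not a setting anyone in this thread reported using, and `alloc_conf` is a hypothetical helper.

```python
import os


def alloc_conf(max_split_size_mb):
    """Build a PYTORCH_CUDA_ALLOC_CONF value that limits the allocator's block split size."""
    if max_split_size_mb <= 0:
        raise ValueError("max_split_size_mb must be positive")
    return f"max_split_size_mb:{int(max_split_size_mb)}"


# 128 is an illustrative value only; a suitable number depends on the card and workload.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = alloc_conf(128)

# import torch  # import torch only after the variable is set so the allocator sees it
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

This only targets fragmentation (reserved memory much larger than allocated memory). It would not be expected to help if the update simply needs more memory than the card has.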
Try 4af3ca5393151d61363c30eef4965e694eeac15e. The other one was throwing errors for me as well. I'm currently back up and running like I was before trying to get the latest build.
That one isn't working for me either. Still going OOM.
After running git checkout xxxxxx, is there anything else I need to do other than close the console and restart?
When you open your auto1111 cmd, it tells you the commit version as soon as you run webui.bat. Does it say "Commit hash: 4af3ca5393151d61363c30eef4965e694eeac15e" followed by "Installing requirements for Web UI..."?
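For anyone who wants to double-check the checkout without reading the startup log, here is a minimal sketch. It assumes `git` is on the PATH and that the script runs from inside the webui checkout; `same_commit` and `current_commit` are hypothetical helper names. The thread uses both full and abbreviated hashes, so the comparison accepts either.

```python
import subprocess


def same_commit(a, b):
    """True if one hash is a prefix of the other (handles abbreviated hashes)."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return False
    return a.startswith(b) or b.startswith(a)


def current_commit(repo_dir="."):
    """Ask git for the full hash of the checked-out commit in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Hashes taken from this thread.
print(same_commit("4af3ca5", "4af3ca5393151d61363c30eef4965e694eeac15e"))
print(same_commit("fd4461d", "4af3ca5393151d61363c30eef4965e694eeac15e"))
```

Calling `same_commit(current_commit(), "4af3ca5")` from the webui folder would then confirm whether the revert actually took effect.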
I restored back to the master branch, and NVidia just put out a driver update.
One of the two made a difference, so at least things are working better. Memory usage SEEMS better. I'm still watching it for a bit, though.
Did you add git pull to your webui script? I've seen a few people do that. For me at least, reverting to an old version fixed it. Funny, because this change made me think xformers was the issue. I guess I'll have to give it another chance; I was harsh.
I'm not sure how related this is, but I haven't seen anybody else mention it. Loading a model in the webui, including at launch, has a coinflip's chance of maxing out my 8GB vram instantly and freezing my PC entirely. Has anybody else experienced this issue? This has been a thing since a few pulls now, even before the suspension. I have been running the webui inside a docker image on Ubuntu 20.04 with rocm and an RX 5700 XT AMD card.
Having the same issue: just loading the webui immediately uses, and keeps using, 5 of the 8 GB of VRAM, ever since the new hires fix was implemented. The most common error it OOMs on has to do with resolution scaling (even with hires fix disabled). I am not using SD2.x models at all, so those should not be the issue.
With each generation the amount of VRAM in use seems to increase by a few MB (which stacks up fast over time). img2img is a no-go entirely, as it immediately OOMs.
Same issue here.
RuntimeError: CUDA out of memory. Tried to allocate 76.38 GiB (GPU 0; 12.00 GiB total capacity; 2.57 GiB already allocated; 7.19 GiB free; 2.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF Time taken: 16.44sTorch active/reserved: 2757/2774 MiB, Sys VRAM: 5051/12288 MiB (41.11%)
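The status line the webui prints after a run ("Torch active/reserved ... Sys VRAM ...") is handy for comparing reports like these. A minimal sketch of parsing it follows, assuming the line format shown in this thread; `parse_vram_status` is a hypothetical helper, not a webui function.

```python
import re

# Matches the status line format quoted in this thread.
STATUS_PATTERN = re.compile(
    r"Torch active/reserved: (?P<active>\d+)/(?P<reserved>\d+) MiB, "
    r"Sys VRAM: (?P<sys_used>\d+)/(?P<sys_total>\d+) MiB"
)


def parse_vram_status(line):
    """Return the MiB figures from a webui status line, or None if it does not match."""
    match = STATUS_PATTERN.search(line)
    if match is None:
        return None
    stats = {name: int(text) for name, text in match.groupdict().items()}
    # Share of total VRAM in use system-wide, as a percentage.
    stats["sys_percent"] = round(100 * stats["sys_used"] / stats["sys_total"], 2)
    return stats


line = (
    "Time taken: 16.44sTorch active/reserved: 2757/2774 MiB, "
    "Sys VRAM: 5051/12288 MiB (41.11%)"
)
print(parse_vram_status(line))
```

In this report only about 41% of the 12 GB card was in use, yet the failed request was for 76.38 GiB, far more than the card's total capacity, so no amount of freed memory could have satisfied it.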
See possible source in "new hires": https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/6725
I do not use Hires Fix, but I can no longer change models on Colab because it causes memory overflow:
The --lowram, --lowvram and --medvram options did not help. This is the default RAM reservation at startup:
Update: I found a solution:
Regardless, I saw that every time I change the model, it occupies 1 GB more memory, so after a while it causes a memory overflow again.
I have this problem as well. It consists of..
Hey guys, I got a similar issue: I updated the UI, and for some reason the VRAM usage skyrocketed. It turned out I had to remove the command lines that start updates at launch. Literally half of my VRAM (3 GB out of 6) was taken from the start of the software, and after removing both command lines ("git pull" and the one line to update torch), the VRAM usage became normal again.
So if you just updated the UI and you're now running out of VRAM, remove the command lines for the updates. Hopefully it helps!
Which file did you edit? I don't have any command lines in the webui-user.bat for that, and there isn't any Git Pull or Torch in the webui.bat
The launcher (the webui-user.bat file). I had put two command lines for the updates, thinking it would only affect the launch, but it was actually taking 3GB VRAM for no reason.
In your case that doesn't seem to be the issue. Sorry I can't help ^^'.
I've made two PRs that I think will finally address this. voldy (auto) has also made recent improvements to the dev branch in https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/0af4127fd14360ebb12c6569d98aebf8047abbfc and https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/ccb92339348f6973de39cde062982a51a4cd0818 that should improve this as well. Basically, if you miss the performance of hires fix in the early days before https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/ef27a18b6b7cb1a8eebdc9b2e88d25baf2c2414d changed it, I think this now fixes it. Note that to take the most advantage of these you should be using --medvram (or --lowvram), not using --no-half-vae, and using a high-performance optimizer like xformers.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/12514 https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/12515
I also closed https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/6725 and https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7002 since this issue is the most relevant. The former was just asking for old hires fix to be added back (where width/height is specified manually, which is supported) and the latter is technically a duplicate of this issue.
Closing this as I've done a few tests and VRAM usage is significantly lower as of the latest dev branch commit. In the scenario given in OP, VRAM peaks just under 6GB, which fits well within their given criteria. Open a new issue with more specifics if problems still occur.
Is there an existing issue for this?
What happened?
I updated the WebUI a few minutes ago and now the VRAM usage when generating an image is way higher. I have 3 monitors (2x 1920x1080 & 1x 2560x1440). I use Wallpaper Engine on all of them, but I have Discord open on one of them nearly 24/7, so Wallpaper Engine is only active on two monitors. 1.5 GB of VRAM is used when I am on the desktop without the WebUI running.
Web browser: Microsoft Edge (Chromium)
OS: Windows 11 (Build number: 22621.963)
GPU: NVIDIA GeForce RTX 3070 Ti (KFA2)
CPU: Intel Core i7-11700K
RAM: Corsair VENGEANCE LPX 32 GB (2 x 16 GB) DDR4 DRAM 3200 MHz C16
Steps to reproduce the problem
What should have happened?
The generation should complete without any errors
Commit where the problem happens
1cfd8aec4ae5a6ca1afd67b44cb4ef6dd14d8c34
What platforms do you use to access UI ?
Windows
What browsers do you use to access the UI ?
Microsoft Edge
Command Line Arguments
Additional information, context and logs
I have the config for animefull from the Novel AI leak in the configs folder under the name Anything V3.0.yaml, but I get this error too when I remove it from the configs folder and completely restart the WebUI. This is the error I get