lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI
GNU Affero General Public License v3.0

Could not allocate tensor with 377487360 bytes. There is not enough GPU video memory available! #38

Closed: imamqaum1 closed this issue 6 months ago

imamqaum1 commented 1 year ago

Is there an existing issue for this?

What happened?

Stable Diffusion crashes partway through generation and reports the error: Could not allocate tensor with 377487360 bytes. There is not enough GPU video memory available! (see screenshot: Screenshot 2023-03-11 045325)

Steps to reproduce the problem

  1. Go to Text2Img
  2. Insert prompt and negative prompt
  3. Start generating (see screenshot: Screenshot 2023-03-11 045445)

What should have happened?

Stable Diffusion should run normally and generate the images.

Commit where the problem happens

RuntimeError: Could not allocate tensor with 377487360 bytes. There is not enough GPU video memory available!

What platforms do you use to access the UI?

Windows

What browsers do you use to access the UI?

Microsoft Edge

Command Line Arguments

--lowvram --disable-nan-check --autolaunch --no-half
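For context, the --lowvram flag trades speed for memory: model parts stay in system RAM and are moved to the GPU only while they run. A minimal sketch of that idea, using a stub "module" so the sketch needs no GPU (the real logic lives in modules/lowvram.py; this is illustrative only):

```python
vram = set()  # names of modules currently resident in simulated "VRAM"

def run_on_gpu_then_evict(name, fn, x):
    vram.add(name)            # move this stage's weights to the GPU
    try:
        return fn(x)          # run the stage
    finally:
        vram.discard(name)    # evict so the next stage fits in VRAM

y = run_on_gpu_then_evict("vae_decoder", lambda v: v * 2, 21)
print(y, vram)  # 42 set() -- nothing stays resident after the call
```

The constant moving back and forth is why --lowvram is much slower than --medvram, which keeps more of the model resident.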

List of extensions

None

Console logs

venv "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: ff558348682fea569785dcfae1f1282cfbefda6b
Installing requirements for Web UI
Launching Web UI with arguments: --lowvram --disable-nan-check --autolaunch --no-half
Warning: experimental graphic memory optimization is disabled due to gpu vendor. Currently this optimization is only available for AMDGPUs.
Disabled experimental graphic memory optimizations.
Interrogations are fallen back to cpu. This doesn't affect on image generation. But if you want to use interrogate (CLIP or DeepBooru), check out this issue: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/10
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [bfcaf07557] from D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\models\Stable-diffusion\768-v-ema.ckpt
Creating model from config: D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\configs\stable-diffusion\v2-inference-v.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 235.4s (load weights from disk: 133.7s, find config: 48.2s, load config: 0.3s, create model: 3.4s, apply weights to model: 40.4s, apply dtype to VAE: 0.8s, load VAE: 2.6s, move model to device: 5.0s, hijack: 0.1s, load textual inversion embeddings: 0.8s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Calculating sha256 for D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\models\Stable-diffusion\aresMix_v01.safetensors: 6ecece11bf069e9950746d33ab346826c5352acf047c64a3ab74c8884924adf0
Loading weights [6ecece11bf] from D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\models\Stable-diffusion\aresMix_v01.safetensors
Creating model from config: D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (InvokeAI).
Model loaded in 42.4s (create model: 1.7s, apply weights to model: 40.2s, load textual inversion embeddings: 0.2s).
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [02:48<00:00,  8.41s/it]
Error completing request███████████████████████████████████████████████████████████████| 20/20 [02:05<00:00,  6.39s/it]
Arguments: ('task(c6cyhnv8oj55v19)', 'photo of a 22 years old Japanese girl, detailed facial features, beautiful detailed face, perfect face, dreamy face expression, high detailed skin, white skin texture, detailed eyes, seductive eyes, alluring eyes, beautiful eyes, full red lips, hourglass body, perfect body, skinny, petite, red pussy, showing pussy, nude, small breast, sitting, hijab, hijab, elegant, sexually suggestive, sex appeal, seductive look, bedroom, submissive, fantasy environment, magical atmosphere, dramatic style, golden hour, embers swirling, soft lighting, volumetric lighting, realistic lighting, cinematic lighting, natural lighting, long exposure trails, hyper detailed, sharp focus, bokeh, masterpiece, award winning photograph, epic character composition,Key light, backlight, soft natural lighting, photography 800 ISO film grain 50mm lens RAW aperture f1.6, highly detailed, Girl, full body, full body view, full body shoot, full body photograph', '(asian:1.2), black and white, sepia, bad art, b&w, canvas frame, cartoon, 3d, Photoshop, video game, 3d render, semi-realistic, cgi, render, sketch, drawing, anime, worst quality, low quality, jpeg artifacts, duplicate, messy drawing, black-white, doll, illustration, lowres, deformed, disfigured, mutation, amputation, distorted, mutated, mutilated, poorly drawn, bad anatomy, wrong anatomy, bad proportions, gross proportions, double body, long body, unnatural body, extra limb, missing limb, floating limb, disconnected limbs, malformed limbs, missing arms, extra arms, disappearing arms, missing legs, extra legs, broken legs, disappearing legs, deformed thighs, malformed hands, mutated hands and fingers, double hands, extra fingers, poorly drawn hands, mutated hands, fused fingers, too many fingers, poorly drawn feet, poorly drawn hands, big hands, hand with more than 5 fingers, hand with less than 5 fingers, bad feet, poorly drawn feet, fused feet, missing feet, bad knee, extra knee, more than 2 legs, poorly 
drawn face, cloned face, double face, bad hairs, poorly drawn hairs, fused hairs, cross-eye, ugly eyes, bad eyes, poorly drawn eyes, asymmetric eyes, cross-eyed, ugly mouth, missing teeth, crooked teeth, bad mouth, poorly drawn mouth, dirty teeth, bad tongue, fused ears, bad ears, poorly drawn ears, extra ears, heavy ears, missing ears, poorly drawn breasts, more than 2 nipples, missing nipples, different nipples, fused nipples, bad nipples, poorly drawn nipples, bad asshole, poorly drawn asshole, fused asshole, bad anus, bad pussy, bad crotch, fused anus, fused pussy, poorly drawn crotch, poorly drawn anus, poorly drawn pussy, bad clit, fused clit, fused pantie, poorly drawn pantie, fused cloth, poorly drawn cloth, bad pantie, obese, ugly, disgusting, morbid, big muscles, blurry, censored, oversaturated, watermark, watermarked, extra digit, fewer digits, signature, text', [], 20, 15, False, False, 1, 1, 6, -1.0, -1.0, 0, 0, 0, False, 720, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\processing.py", line 634, in process_images_inner
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\processing.py", line 634, in <listcomp>
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\processing.py", line 423, in decode_first_stage
    x = model.decode_first_stage(x)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 826, in decode_first_stage
    return self.first_stage_model.decode(z)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\modules\lowvram.py", line 52, in first_stage_model_decode_wrap
    return first_stage_model_decode(z)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 90, in decode
    dec = self.decoder(z)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 637, in forward
    h = self.up[i_level].block[i_block](h, temb)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 132, in forward
    h = nonlinearity(h)
  File "D:\Data Imam\Imam File\web-ui\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\functional.py", line 2059, in silu
    return torch._C._nn.silu(input)
RuntimeError: Could not allocate tensor with 377487360 bytes. There is not enough GPU video memory available!

Additional information

RX 570 4GB, Ryzen 5 3500, 8GB RAM (single channel), AMD Software PRO Edition driver, DirectX 12
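As a sanity check on the failing allocation: 377487360 bytes corresponds exactly to one float32 activation tensor of 256 channels at the full 720x512 generation size from the report. The 256-channel stage width is an assumption for illustration (it is a plausible width inside the SD VAE decoder, which decodes at full image resolution):

```python
# Back-of-the-envelope check of the failing allocation.
batch, channels = 1, 256        # 256-channel decoder stage is assumed
width, height = 720, 512        # generation size from the report
bytes_per_elem = 4              # float32 (--no-half keeps full precision)

tensor_bytes = batch * channels * height * width * bytes_per_elem
print(tensor_bytes)             # 377487360, matching the error message
```

This also explains why dropping --no-half (float16, 2 bytes per element) or lowering the resolution often gets past this exact error on 4GB cards.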

Eleiyas commented 1 year ago

Inside the webui_user.bat:

set COMMANDLINE_ARGS=--medvram --precision full --no-half --no-half-vae --opt-split-attention-v1 --opt-sub-quad-attention --disable-nan-check
set SAFETENSORS_FAST_GPU=1

Works for me with a 6800 XT card (16GB). I can now actually generate above 512x512 without it immediately crashing. I still get some issues, but I can generate tens of images before it even thinks of being weird.

Any of the other commandline args I see other people use make the program completely hang and refuse to generate anything, so if you have the same card as I do, just use what I put in the codeblock ;)

Miraihi commented 1 year ago

After the token merging update you pretty much have to set the token merging ratio to about 0.5 and negative guidance minimum sigma to about 3 (Optimizations tab in Options). That gives a great boost in performance and memory efficiency without sacrificing much. But you can't use the sub-quadratic optimization together with token merging.
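As rough intuition for why token merging helps: self-attention memory grows roughly with the square of the token count, so halving the tokens cuts attention memory to about a quarter. A simplified cost model (not the fork's actual accounting):

```python
def attention_tokens(width, height, merge_ratio):
    # One token per latent position; the latent is 1/8 of the image size.
    tokens = (width // 8) * (height // 8)
    return int(tokens * (1 - merge_ratio))

base = attention_tokens(720, 512, 0.0)      # 5760 tokens
merged = attention_tokens(720, 512, 0.5)    # 2880 tokens
print((merged / base) ** 2)                 # ~0.25x attention memory
```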

Grathew commented 1 year ago

This seems to break my install; lots of black images.

Grathew commented 1 year ago

This doesn't seem to be helping much. I have less crashes, but more empty black images. I have a 6700XT (12GB) if that helps explain it.

Miraihi commented 1 year ago

@Grathew I mentioned that you can't use sub-quad attention with token merging. Choose Doggettx or V1. In case you're not aware, only one of the --opt arguments can be active at a time; if you pass several, only one of them takes effect and the rest are ignored.
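The "only one --opt flag wins" behavior can be pictured as a priority pick: the webui selects a single cross-attention optimization from the candidates you enabled. The priority order below is illustrative, not the fork's exact list:

```python
def pick_optimization(enabled_flags):
    # Illustrative priority order; the real order lives in the webui source.
    priority = ["opt_sub_quad_attention", "opt_split_attention_v1",
                "opt_split_attention"]
    for name in priority:
        if name in enabled_flags:
            return name          # first match wins, the rest are ignored
    return "default"

print(pick_optimization({"opt_split_attention_v1", "opt_sub_quad_attention"}))
# -> opt_sub_quad_attention: the extra flags were never doing anything
```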

upright2003 commented 11 months ago

I reinstalled the latest AMD driver, and images now generate normally at 960x540 on my RX 5500 XT 4GB. But the Hires. fix function is almost unusable. How should I set it?
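One reason Hires. fix hits the limit so easily: the VAE decode happens at the upscaled resolution, so memory grows linearly with pixel count, and a 2x upscale needs roughly 4x the decode memory (same assumed 256-channel float32 stage as the error message suggests):

```python
def vae_stage_bytes(width, height, channels=256):
    # float32 activation at full image resolution (stage width assumed)
    return width * height * channels * 4

base = vae_stage_bytes(960, 540)     # the working resolution above
hires = vae_stage_bytes(1920, 1080)  # after a 2x Hires. fix upscale
print(hires // base)                 # 4: four times the decode memory
```

So on a 4GB card, keeping the Hires. fix upscale factor low (e.g. 1.25-1.5x) is usually the practical setting.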

evascroll commented 11 months ago

I just fixed mine using these args: set COMMANDLINE_ARGS= --precision full --no-half --no-half-vae --opt-split-attention-v1 --opt-sub-quad-attention --disable-nan-check. I had been trying to generate for a couple of days without any progress; after adding the args I can now generate 512x512 with Hires. fix to 1024x1024 (upscale by 2), batch count 8, up to 50 steps, using 3 ControlNets, no problem. I'm using the latest AMD driver 23.9.3 and the latest chipset drivers. Specs: Windows 11, AMD 5800X CPU, ASUS Dual 6700 XT OC GPU, 32GB RAM, ControlNet 1.1.410, the A1111 fork from lshqqytiger, checkpoints 1.5, 2.0, 2.1 (SDXL no luck, still testing). Hope it helps.

AuroraTheDragon commented 9 months ago

First - the arguments. Second - not sure what's the maximum resolution your GPU is capable of. I can generate a maximum of 600x800 on my RX 580 (8Gb) with arguments --medvram --precision full --no-half --no-half-vae --opt-split-attention-v1 --opt-sub-quad-attention --disable-nan-check.

I am having a similar issue. I have an RX 580 with 8GB of VRAM, and 2x16GB RAM. About 5 days ago I was still able to generate images well above 768x512 and could even upscale around 4x, no issues at all. But yesterday it suddenly stopped working, claiming that I don't have enough GPU video memory available. I tried uninstalling everything (Python 3.10.6, Git, and Stable Diffusion) and then reinstalled it all, but it still didn't work. I'm really hoping this isn't a graphics card problem, and I don't think it is, because I can run AAA games smoothly without crashes, so maybe it has something to do with Stable Diffusion's latest updates.

Sepacc commented 9 months ago

I have an RX 580 (8GB) and 2x8GB RAM. I tried the arguments mentioned before and they work fairly well for me: at least I can generate 600x800 now, whereas before I was getting an error every 2-3 512x512 images, and 600x800 failed outright. Also, I'm using official SD.Next, if that's important.

Menober commented 9 months ago

Same memory-allocation error here. Is there no way to process this data in chunks?

FrancoContegni commented 9 months ago

I can't find the webui-user.bat file.

thedevtechs commented 9 months ago

First - the arguments. Second - not sure what's the maximum resolution your GPU is capable of. I can generate a maximum of 600x800 on my RX 580 (8Gb) with arguments --medvram --precision full --no-half --no-half-vae --opt-split-attention-v1 --opt-sub-quad-attention --disable-nan-check.

Thank you!! Got me up and running on my AMD RX 6600 (finally).

thisisnotreal459 commented 8 months ago

My setup: RX 6800, Windows 11 Pro 22H2, Adrenalin Edition 23.4.1.

1. It is important for me that the SD folder is in the root of drive C.
2. Open CMD in the root of the stable-diffusion-webui-directml directory, run git pull to ensure the latest update, then pip install -r requirements.txt. <- it was at this point I knew I effed up during initial setup, because I saw several missing items getting installed.
3. In the webui-user.bat file, I added the following line: set COMMANDLINE_ARGS=--medvram --precision full --no-half --no-half-vae --opt-sub-quad-attention --opt-split-attention --opt-split-attention-v1 --disable-nan-check --autolaunch

Results at 1024x1024 with my trained .ckpt model:

euler a: MAX 26/26 [01:16<00:00, 2.96s/it]
dpm++ 2m karras: MAX 26/26 [02:19<00:18, 6.05s/it]

With the deliberate_v2.safetensors model at 1024x1280, DPM++ 2M Karras: max 26/26 [01:50<00:00, 4.24s/it]

I usually generate 440x640, 4 pictures at a time, and then do any necessary upscaling in Topaz Photo AI.

Good luck.

P.S. 1280x1280: RuntimeError: Could not allocate tensor with 377487360 bytes. There is not enough GPU video memory available! -)))

So I went ahead and tried this solution, and it was after the "pip install -r requirements.txt" step that things went wrong for me. Now whenever I run webui-user.bat it spits out this:

venv "M:\Program Files\Stable Diffusion\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
fatal: No names found, cannot describe anything.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: 1.7.0
Commit hash: cfa6e40e6d7e290b52940253bf705f282477b890
Traceback (most recent call last):
  File "M:\Program Files\Stable Diffusion\stable-diffusion-webui-directml\launch.py", line 48, in <module>
    main()
  File "M:\Program Files\Stable Diffusion\stable-diffusion-webui-directml\launch.py", line 39, in main
    prepare_environment()
  File "M:\Program Files\Stable Diffusion\stable-diffusion-webui-directml\modules\launch_utils.py", line 560, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
Press any key to continue . . .
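The traceback above comes from a startup self-check, not from generation itself. A simplified sketch of the check (names and structure assumed, loosely following launch_utils.prepare_environment):

```python
def check_gpu(torch_sees_gpu: bool, skip_flag_set: bool) -> str:
    # If torch cannot reach a GPU and the skip flag is absent, launch aborts.
    if not torch_sees_gpu and not skip_flag_set:
        raise RuntimeError(
            "Torch is not able to use GPU; add --skip-torch-cuda-test "
            "to COMMANDLINE_ARGS variable to disable this check"
        )
    return "launch continues"

# On DirectML builds torch.cuda is unavailable, so --skip-torch-cuda-test
# (or --use-directml, which routes around CUDA) lets startup proceed.
print(check_gpu(torch_sees_gpu=False, skip_flag_set=True))
```
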

Sairu9 commented 6 months ago

I followed this tutorial: https://www.youtube.com/watch?v=mKxt0kxD5C0&t=1087s&ab_channel=FE-Engineer

Then, in the webui-user.bat file, I added the following line: set COMMANDLINE_ARGS=--use-directml --medvram --precision full --no-half --no-half-vae --opt-split-attention-v1 --opt-sub-quad-attention --disable-nan-check

With --medvram --precision full --no-half --no-half-vae --opt-split-attention-v1 --opt-sub-quad-attention --disable-nan-check it only works with 1.5 models, not with XL models. Adding the args speeds up generation significantly, but I lose the XL models.

RX 6800 GPU

JonathanDiez commented 6 months ago

I added both and nothing changed 🤷. My GPU is an RX 6600.

Did you fix this?

CoolnJuicy commented 5 months ago

For me it was the simple combo of adding --medvram to the .bat file and checking the Low VRAM box in ControlNet. I installed ControlNet last night; come morning I was getting the OP's error. This worked.

Ryzen 3600, RX 580, 16GB RAM.

zmsoft commented 3 months ago

Can a low-memory graphics card run only on the CPU? AMD RX 550 2GB. set COMMANDLINE_ARGS=--use-directml --lowvram --opt-split-attention --enable-insecure-extension-access --skip-torch-cuda-test
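On the CPU question: the webui does support forcing devices onto the CPU via --use-cpu (very slow, but it sidesteps the 2GB VRAM limit entirely). A possible webui-user.bat line, treating the exact flag combination as a suggestion to test rather than a known-good recipe:

```shell
REM Force everything onto the CPU; very slow, but avoids VRAM allocation errors.
set COMMANDLINE_ARGS=--use-cpu all --skip-torch-cuda-test --no-half --precision full
```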