pkuliyi2015 / multidiffusion-upscaler-for-automatic1111

Tiled Diffusion and Tiled VAE optimization, licensed under CC BY-NC-SA 4.0
4.72k stars · 334 forks

Out of VRAM memory (RTX 4090 - from 1k to 2k upscale) #125

Closed jurandfantom closed 1 year ago

jurandfantom commented 1 year ago

Hey, it's me... again. For the past few weeks I have been getting the message that I just can't upscale pictures, because I'm constantly out of VRAM. I will post everything I know, and hopefully with your help it will be possible to figure out what happened. Edit: I just did a fresh install of automatic1111, with CUDA 11.8, cuDNN beta, everything beta (whatever was possible), and it's still the same thing. I tinkered with lower tiling values, but still, after I click, the 19 GB of VRAM that was free is not enough to upscale with the plugin. Something is broken. The updated version of automatic does not include xformers, as those are not needed with the torch 2.0/2.1 that got installed. Everything else I leave as in the original post, because things behave the same way.

------------ Since when: 2 weeks. I thought it was because my plugin was outdated - I prefer not to jump right away to the newest version when things work - but I learned that to update plugins you sometimes have to update automatic1111 (and considering the last issue, after a weird commit that broke everything, I preferred to avoid that). I have now updated everything (automatic and the plugin).

------------ Additional info: Some time ago I followed a suggestion to update a few things in the webui requirements that speeds up the RTX 4000 series by 200% (and indeed it's way faster - I hope that's not the cause, because I would sacrifice the plugin for those speeds). But I'm not even sure whether it's because of that or not.
(screenshot attachment: 23,04,11 - 16,51,19 - 7618 a)

------------ Current versions:

```
blendmodes==2022
transformers==4.25.1
accelerate==0.17.1
basicsr==1.4.2
pyre-extensions==0.0.23
gfpgan==1.3.8
gradio==3.23
numpy==1.23.3
Pillow==9.4.0
realesrgan==0.3.0
torch
omegaconf==2.2.3
pytorch_lightning==1.9.4
scikit-image==0.19.2
fonts
font-roboto
timm==0.6.7
piexif==1.1.3
einops==0.4.1
jsonmerge==1.8.0
clean-fid==0.1.29
resize-right==0.0.2
torchdiffeq==0.2.3
kornia==0.6.7
lark==1.1.2
inflection==0.5.1
GitPython==3.1.30
torchsde==0.2.5
safetensors==0.3.0
httpcore<=0.15
fastapi==0.94.0
```

python: 3.10.6 • torch: 2.0.0+cu118 • xformers: 0.0.17+b3d75b3.d20230321 • gradio: 3.16.2 • commit: [a9fed7c3]

------------ Error log:

```
MultiDiffusion Sampling: : 0it [01:34, ?it/s]
MultiDiffusion Sampling: : 0it [01:15, ?it/s]
[Mem] rss: 8.904 GB, vms: 17.141 GB
[VRAM] free: 17.076 GB, total: 23.988 GB
Traceback (most recent call last):
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1073, in process_api
    inputs = self.preprocess_data(fn_index, inputs, state)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 962, in preprocess_data
    processed_input.append(block.preprocess(inputs[i]))
IndexError: list index out of range
Traceback (most recent call last):
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1073, in process_api
    inputs = self.preprocess_data(fn_index, inputs, state)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 962, in preprocess_data
    processed_input.append(block.preprocess(inputs[i]))
IndexError: list index out of range
[Mem] rss: 8.825 GB, vms: 16.953 GB
[VRAM] free: 17.076 GB, total: 23.988 GB
[Tiled Diffusion] upscaling image with ESRGAN_4x...
[Tiled Diffusion] ControlNet found, support is enabled.
MultiDiffusion Sampling: : 0it [00:00, ?it/s]
MultiDiffusion hooked into DPM++ 2M Karras sampler. Tile size: 96x96, Tile batches: 25, Batch size: 1
[Tiled VAE]: the input size is tiny and unnecessary to tile.
```
```
Error completing request
Arguments: ('task(yzpf3jbt47rbi5c)', 0, 'top to down photo of old glossy and weared leather, flat surface, (in focus), (no depth of field), (masterpiece:1.2), (best quality), (8k), (HDR), (wallpaper), (cinematic lighting),', 'relfection, surface reflection, light reflections, (depth_of_field), (out-of-focus), (cartoon), (illustration), (saturated), (grain), (deformed), (poorly drawn), (lowres), (lowpoly), (CG), (3d), (blurry), (duplicate), (watermark), (label), (signature), (text), (cropped),', [], <PIL.Image.Image image mode=RGBA size=1024x1024 at 0x1EA6894CF70>, None, None, None, None, None, None, 41, 8, 4, 0, 1, False, False, 2, 3, 4, 1.5, 0.5, -1.0, -1.0, 0, 0, 0, False, 1024, 1024, 0, 0, 32, 0, '', '', '', [], 0, False, True, False, 0, -1, True, 'MultiDiffusion', False, 10, 1, 1, 64, False, True, 1024, 1024, 96, 96, 48, 1, 'ESRGAN_4x', 2, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, True, False, True, True, 0, 3072, 192, True, False, 1, False, False, False, 1.1, 1.5, 100, 0.7, False, False, True, False, False, 0, 'Gustavosta/MagicPrompt-Stable-Diffusion', '', False, False, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, None, 'Refresh models', <scripts.external_code.ControlNetUnit object at 0x000001EA68B0F640>, <scripts.external_code.ControlNetUnit object at 0x000001EA68B0F460>, <scripts.external_code.ControlNetUnit object at 0x000001EA68B0C760>, <scripts.external_code.ControlNetUnit object at 0x000001EA68B0E740>, False,
'', 0.5, True, False, '', 'Lerp', False, False, '1:1,1:2,1:2', '0:0,0:0,0:1', '0.2,0.8,0.8', 150, 0.2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, '', '
\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 0.5, 'Linear', 'None', '
Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8
', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '
Will upscale the image by the selected scale factor; use width and height sliders to set tile size
', 64, 0, 2, 1, '', 0, '', 0, '', True, False, False, False, 0, 'Blur First V1', 0.25, 10, 10, 10, 10, 1, False, '', '', 0.5, 1, False, 'Not set', True, True, '', '', '', '', '', 1.3, 'Not set', 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', 1.3, 'Not set', False, 'None', '', 'Bloom It!', '', False, True, False, True, True, 'Create in UI', False, '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', None, False, None, False, None, False, None, False, 50, 'Positive', 0, ', ', 'Generate and always save', 32, '
Will upscale the image depending on the selected target size type
', 512, 0, 8, 32, 64, 0.35, 32, 0, True, 0, False, 8, 0, 0, 2048, 2048, 2) {}
Traceback (most recent call last):
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\img2img.py", line 172, in img2img
    processed = process_images(p)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\processing.py", line 594, in process_images_inner
    p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\processing.py", line 1056, in init
    self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 830, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 83, in encode
    h = self.encoder(x)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 478, in __call__
    return self.net.original_forward(x)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 526, in forward
    h = self.down[i_level].block[i_block](hs[-1], temb)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 138, in forward
    h = self.norm2(h)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\normalization.py", line 273, in forward
    return F.group_norm(
  File "E:\Magazyn\Grafika\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\torch\nn\functional.py", line 2530, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 6.00 GiB (GPU 0; 23.99 GiB total capacity; 16.41 GiB already allocated; 1.30 GiB free; 19.42 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
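The OOM message above suggests setting `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF` when reserved memory far exceeds allocated memory. A minimal sketch of doing that from Python (the value 512 is illustrative, not a recommendation from this thread):

```python
import os

# PyTorch reads PYTORCH_CUDA_ALLOC_CONF when CUDA is first initialized,
# so the variable must be set before importing torch. For the webui it
# is more practical to export it in webui-user.bat / the shell instead.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])  # → max_split_size_mb:512
```

This only mitigates fragmentation; it cannot make a single 6 GiB allocation fit if the VRAM genuinely isn't there.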

PotatoBananaApple commented 1 year ago

If you reduce the VAE tile size, does it still complain? Your settings seem quite high.
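For context on why the VAE settings matter: the traceback fails inside the VAE encoder, before any latent tiling happens. A back-of-envelope sketch (assuming the SD VAE encoder's 128-channel full-resolution feature maps in fp32; `feature_map_gib` is a hypothetical helper, not the extension's code):

```python
def feature_map_gib(height: int, width: int, channels: int = 128,
                    bytes_per_element: int = 4) -> float:
    """Size in GiB of one full-resolution VAE encoder feature map."""
    return height * width * channels * bytes_per_element / 2**30

# Encoding a 2048x2048 image keeps several maps like this alive at once,
# which is consistent with a multi-GiB allocation failing in group_norm.
print(feature_map_gib(2048, 2048))  # → 2.0
```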

jurandfantom commented 1 year ago

Back after tests and everything. So the issue is the lack of xformers. The replacements for xformers that come with torch 2.0 and 2.1 are indeed a little bit faster, but they don't allow generating a 2048x2048 txt2img picture, whereas with xformers it is possible. That leads me to the conclusion that to use this extension you need to stick with xformers. Happily for me, I can still just use --xformers with torch 2.1 without issues.

For the extension creator: if possible, try to find a way to solve this issue when somebody uses an updated torch with --opt-sdp-attention instead of --xformers. If there is no other way, then add a note that people need to install xformers to use this plugin.
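The requested note could even be automated: an extension can detect up front whether xformers is available. A minimal sketch (`has_xformers` and the backend names are hypothetical, not the extension's actual API):

```python
import importlib.util

def has_xformers() -> bool:
    """True if the xformers package is importable in this environment."""
    return importlib.util.find_spec("xformers") is not None

# A guard like this could warn users running --opt-sdp-attention that
# large upscales may need xformers' memory-efficient attention.
attention_backend = "xformers" if has_xformers() else "sdp"
print(attention_backend)
```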

PotatoBananaApple commented 1 year ago

This extension works without xformers; I used it without xformers for a while. Maybe the issue is that xformers reduces VRAM usage and allows generation of bigger images.

pkuliyi2015 commented 1 year ago

From the picture you can see that your tile size is very large and the Tiled VAE is automatically skipped. Lower the tile size and it should work.
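The "Tile batches: 25" in the log checks out: a 2x upscale of the 1024x1024 input gives a 2048x2048 output, i.e. a 256x256 latent, covered by 96x96 latent tiles. A quick sketch of the arithmetic (`num_tiles` is an illustrative helper, and the overlap of 48 is read off the settings dump; both are assumptions, not the extension's actual code):

```python
import math

def num_tiles(latent_size: int, tile: int, overlap: int) -> int:
    """Tiles needed to cover one latent dimension with overlapping tiles."""
    stride = tile - overlap
    return math.ceil(max(latent_size - tile, 0) / stride) + 1

per_dim = num_tiles(256, 96, 48)   # 256x256 latent, tile 96, overlap 48
print(per_dim * per_dim)  # → 25, matching "Tile batches: 25" in the log
```

Smaller tiles mean more batches, but each batch's peak allocation shrinks, which is why lowering the tile size helps with OOM.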