pkuliyi2015 / multidiffusion-upscaler-for-automatic1111

Tiled Diffusion and VAE optimize, licensed under CC BY-NC-SA 4.0

Tiled VAE is really useful #38

Closed cbisuper closed 1 year ago

cbisuper commented 1 year ago

I have an old laptop with a 2GB NVIDIA card. It used to top out at roughly 600×400; with Tiled VAE it has reached 1200×800. Occasionally I get black blocks, or run out of VRAM while upscaling, but most runs succeed. To put that in perspective: my desktop has an 8GB AMD card, and because it can't use this extension, the largest image I can generate directly is around 400k-500k pixels; even upscaling from a lower resolution, I have only ever reached about 900k pixels. That is almost the same as what the NVIDIA card achieves with Tiled VAE. A card with a quarter of the VRAM producing images of roughly the same pixel count shows how impressive this extension is.

Could the author let AMD cards enjoy Tiled VAE too?

pkuliyi2015 commented 1 year ago

Thanks for the interest. Honestly, I don't know what errors AMD cards actually throw; I don't have one, so I've been working blind. Could you post a screenshot or something so I can understand what's going on?

qwerkilo commented 1 year ago

Thanks for the interest. Honestly, I don't know what errors AMD cards actually throw; I don't have one, so I've been working blind. Could you post a screenshot or something so I can understand what's going on?

I have an AMD card. With full precision the extension works; the only issue is a color block in the bottom-left corner of generated images. But I can't enable the Move VAE to GPU option, or it throws an error. I'll post a screenshot later.

cbisuper commented 1 year ago

Thanks for the interest. Honestly, I don't know what errors AMD cards actually throw; I don't have one, so I've been working blind. Could you post a screenshot or something so I can understand what's going on?

I have an AMD card. With full precision the extension works; the only issue is a color block in the bottom-left corner of generated images. But I can't enable the Move VAE to GPU option, or it throws an error. I'll post a screenshot later.

If I don't turn the GPU option on, it prompts me to enable Move to GPU in order to use Tiled VAE.

cbisuper commented 1 year ago

Thanks for the interest. Honestly, I don't know what errors AMD cards actually throw; I don't have one, so I've been working blind. Could you post a screenshot or something so I can understand what's going on?

Environment: Win10 22H2; GPU: RX 5500 XT with 8GB of VRAM.

███████| 38/38 [09:53<00:00, 20.12s/it]
[Tiled VAE]: input_size: torch.Size([1, 4, 100, 75]), tile_size: 48, padding: 11
[Tiled VAE]: split to 2x2 = 4 tiles. Optimal tile size 32x48, original tile size 48x48
Error completing request
Arguments: ('task(fee495zaaudsnc6)', 'masterpiece, best quality, 1girl, aqua eyes, black hair, closed mouth, earrings, multicolored background, hoop earrings, jewelry, looking at viewer, long hair, outdoors, solo, full body, alluring, clean, beautiful face, pure face, pale skin, sexy pose, navel,((luxury dress)) ', 'sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, bad anatomy,(long hair:1.4),DeepNegative,(fat:1.2),facing away, looking away,tilted head, lowres,bad anatomy,bad hands, text, error, missing fingers,extra digit, fewer digits, cropped, worstquality, low quality, normal quality,jpegartifacts,signature, watermark, username,blurry,bad feet,cropped,poorly drawn hands,poorly drawn face,mutation,deformed,worst quality,low quality,normal quality,jpeg artifacts,signature,watermark,extra fingers,fewer digits,extra limbs,extra arms,extra legs,malformed limbs,fused fingers,too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions,gross proportions,text,error,missing fingers,missing arms,missing legs,extra digit, extra arms, extra leg, extra foot,', [], 28, 13, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 400, 300, True, 0.3, 2, 'Latent', 10, 0, 0, [], 0, False, False, 1024, 1024, True, 64, 64, 32, 1, 'None', 2, False, True, True, True, True, True, 512, 48, False, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 1, False, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, '', 5, 24, 12.5, 1000, 'DDIM', 0, 64, 64, '', 64, 7.5, 0.42, 'DDIM', 64, 64, 1, 0, 92, True, True, True, False, False, False, 'midas_v21_small', 0, 0, 512, 512, False, False, True, True, True, False, False, 1, False, False, 2.5, 4, 0, False, 0, 1, False, False, 'u2net', False, False, False, False) {}
Traceback (most recent call last):
  File "E:\AICyber\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "E:\AICyber\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 634, in process_images_inner
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 634, in <listcomp>
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 423, in decode_first_stage
    x = model.decode_first_stage(x)
  File "E:\AICyber\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "E:\AICyber\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 826, in decode_first_stage
    return self.first_stage_model.decode(z)
  File "E:\AICyber\stable-diffusion-webui\modules\lowvram.py", line 52, in first_stage_model_decode_wrap
    return first_stage_model_decode(z)
  File "E:\AICyber\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 90, in decode
    dec = self.decoder(z)
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 481, in __call__
    return self.vae_tile_forward(x)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 369, in wrapper
    ret = fn(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 627, in vae_tile_forward
    tile = z[:, :, input_bbox[2]:input_bbox[3],
RuntimeError: Cannot set version_counter for inference tensor

qwerkilo commented 1 year ago

Thanks for the interest. Honestly, I don't know what errors AMD cards actually throw; I don't have one, so I've been working blind. Could you post a screenshot or something so I can understand what's going on?

Environment: Win10 22H2; GPU: RX 5500 XT with 8GB of VRAM.

[quoting the same Tiled VAE log and "RuntimeError: Cannot set version_counter for inference tensor" traceback as above]

Yes, I get this same error when I enable Move VAE to GPU.

qwerkilo commented 1 year ago

[screenshot 01] [screenshot 02]

The first screenshot shows my settings. With these settings it can generate, but every output shows the artifact in the second screenshot; enabling Move VAE to GPU then produces the same error as the OP.

qwerkilo commented 1 year ago

[screenshot 03]

Next, with Move VAE to GPU enabled and the settings shown above, it errors out. The log follows:

[Tiled VAE]: input_size: torch.Size([1, 3, 1536, 1024]), tile_size: 512, padding: 32
[Tiled VAE]: split to 3x2 = 6 tiles. Optimal tile size 480x512, original tile size 512x512
Error completing request
Arguments: ('task(wzt63m65jskm2th)', 0, 'masterpiece, best quality, highres, absurdres, 1girl, solo,   mix4,', 'easynegative, NG_DeepNegative_V1_75T, badhandv4, bad-picture-chill-75v', [], <PIL.Image.Image image mode=RGBA size=512x768 at 0x1346C9BEFE0>, None, None, None, None, None, None, 25, 15, 4, 0, 1, False, False, 1, 1, 7, 1.5, 0.35, -1.0, -1.0, 0, 0, 0, False, 768, 512, 0, 0, 32, 0, '', '', '', [], 0, False, False, True, 1024, 1024, 96, 96, 48, 1, 'None', 2, False, False, 1, False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', True, False, True, 1024, 1024, 64, 64, 32, 1, '4x-UltraSharp', 2, False, False, 1, False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', True, True, True, True, 0, 512, 64, False, '', 0, True, False, 'LoRA', 'cuteGirlMix4_v10(4768d15b1b67)', 0.8, 0.8, 'LoRA', 'fashionGirl_v50(718d1cd168e7)', 0.2, 0.2, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, 'LoRA', 'None', 0, 0, None, 'Refresh models', <scripts.external_code.ControlNetUnit object at 0x000001346741FB50>, <scripts.external_code.ControlNetUnit object at 0x000001346741F0A0>, '<ul>\n<li><code>CFG Scale</code> should be 2 or lower.</li>\n</ul>\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 1, 'None', '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', 0, '', 0, '', True, False, False, False, 0, None, False, None, False, 50, '<p style="margin-bottom:0.75em">Will upscale the image depending on the selected target size type</p>', 512, 0, 8, 32, 64, 0.35, 32, 0, True, 0, False, 8, 0, 0, 2048, 2048, 2) {}
Traceback (most recent call last):
  File "G:\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "G:\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "G:\stable-diffusion-webui-directml\modules\img2img.py", line 171, in img2img
    processed = process_images(p)
  File "G:\stable-diffusion-webui-directml\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "G:\stable-diffusion-webui-directml\modules\processing.py", line 577, in process_images_inner
    p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
  File "G:\stable-diffusion-webui-directml\modules\processing.py", line 1022, in init
    self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
  File "G:\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "G:\stable-diffusion-webui-directml\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "G:\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "G:\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 830, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "G:\stable-diffusion-webui-directml\modules\lowvram.py", line 48, in first_stage_model_encode_wrap
    return first_stage_model_encode(x)
  File "G:\stable-diffusion-webui-directml\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 83, in encode
    h = self.encoder(x)
  File "G:\stable-diffusion-webui-directml\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "G:\stable-diffusion-webui-directml\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 487, in __call__
    return self.vae_tile_forward(x)
  File "G:\stable-diffusion-webui-directml\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 369, in wrapper
    ret = fn(*args, **kwargs)
  File "G:\stable-diffusion-webui-directml\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "G:\stable-diffusion-webui-directml\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 635, in vae_tile_forward
    tile = z[:, :, input_bbox[2]:input_bbox[3],
RuntimeError: Cannot set version_counter for inference tensor

After this error occurs, unchecking Move VAE to GPU and generating again still throws RuntimeError: Cannot set version_counter for inference tensor. Only restarting the webui clears it.
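An error that persists until restart would be consistent with the patched VAE forward not being restored after an exception. A generic sketch of the hijack-and-restore pattern with try/finally (illustrative names only, not the extension's actual code):

import torch.nn as nn

def run_with_tiled_forward(decoder: nn.Module, z, tiled_forward):
    # Swap in a tiled forward for one call; the finally block restores
    # the original even if the tiled pass raises, so a single failure
    # cannot poison every later generation until the webui restarts.
    original_forward = decoder.forward
    decoder.forward = tiled_forward
    try:
        return decoder.forward(z)
    finally:
        decoder.forward = original_forward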

pkuliyi2015 commented 1 year ago

Thank you very much, I've seen them ^ω^

pkuliyi2015 commented 1 year ago

I've made a tentative fix. It may not help, but please give it a test.

cbisuper commented 1 year ago

I've made a tentative fix. It may not help, but please give it a test.

I saw the English page had been updated an hour ago, so I pulled the update. Enabling Move VAE to GPU no longer throws the inference tensor error, but with the option on or off it still fails to produce a final image.

100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [02:30<00:00, 7.53s/it]
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [02:28<00:00, 14.83s/it]
[Tiled VAE]: input_size: torch.Size([1, 4, 125, 100]), tile_size: 48, padding: 11
[Tiled VAE]: split to 3x2 = 6 tiles. Optimal tile size 48x48, original tile size 48x48
[Tiled VAE]: Fast mode enabled, estimating group norm parameters on 38 x 48 image
Error completing request
Arguments: ('task(awjut50b897senq)', 'masterpiece, best quality, 1girl, aqua eyes, black hair, closed mouth, earrings, multicolored background, hoop earrings, jewelry, looking at viewer, long hair, outdoors, solo, full body, alluring, clean, beautiful face, pure face, pale skin, sexy pose, navel,((luxury dress)),', 'sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, bad anatomy,(long hair:1.4),DeepNegative,(fat:1.2),facing away, looking away,tilted head, lowres,bad anatomy,bad hands, text, error, missing fingers,extra digit, fewer digits, cropped, worstquality, low quality, normal quality,jpegartifacts,signature, watermark, username,blurry,bad feet,cropped,poorly drawn hands,poorly drawn face,mutation,deformed,worst quality,low quality,normal quality,jpeg artifacts,signature,watermark,extra fingers,fewer digits,extra limbs,extra arms,extra legs,malformed limbs,fused fingers,too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions,gross proportions,text,error,missing fingers,missing arms,missing legs,extra digit, extra arms, extra leg, extra foot,\n', [], 20, 15, False, False, 1, 1, 11, -1.0, -1.0, 0, 0, 0, False, 500, 400, True, 0.5, 2, 'Latent', 10, 0, 0, [], 0, False, False, 1024, 1024, True, 64, 64, 32, 1, 'None', 2, False, True, False, True, True, True, 512, 48, False, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 1, False, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, '', 5, 24, 12.5, 1000, 'DDIM', 0, 64, 64, '', 64, 7.5, 0.42, 'DDIM', 64, 64, 1, 0, 92, True, True, True, False, False, False, 'midas_v21_small', 0, 0, 512, 512, False, False, True, True, True, False, False, 1, False, False, 2.5, 4, 0, False, 0, 1, False, False, 'u2net', False, False, False, False) {}
Traceback (most recent call last):
  File "E:\AICyber\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "E:\AICyber\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 634, in process_images_inner
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 634, in <listcomp>
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "E:\AICyber\stable-diffusion-webui\modules\processing.py", line 423, in decode_first_stage
    x = model.decode_first_stage(x)
  File "E:\AICyber\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "E:\AICyber\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 826, in decode_first_stage
    return self.first_stage_model.decode(z)
  File "E:\AICyber\stable-diffusion-webui\modules\lowvram.py", line 52, in first_stage_model_decode_wrap
    return first_stage_model_decode(z)
  File "E:\AICyber\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 90, in decode
    dec = self.decoder(z)
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 488, in __call__
    return self.vae_tile_forward(x)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 370, in wrapper
    ret = fn(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 668, in vae_tile_forward
    if self.estimate_group_norm(downsampled_z, estimate_task_queue, color_fix=self.color_fix):
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 583, in estimate_group_norm
    tile = group_norm_func(tile)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 462, in group_norm_func
    return custom_group_norm(x, 32, mean, var, weight, bias, 1e-6)
  File "E:\AICyber\stable-diffusion-webui\scripts\vae_optimize.py", line 332, in custom_group_norm
    out = F.batch_norm(input_reshaped, mean, var, weight=None, bias=None,
  File "E:\AICyber\stable-diffusion-webui\Python3.10\lib\site-packages\torch\nn\functional.py", line 2450, in batch_norm
    return torch.batch_norm(
RuntimeError: shape '[1, 32, 1, 1, 1]' is invalid for input of size 0

qwerkilo commented 1 year ago

The Move VAE to GPU option now works without errors, but the color-block problem is still there and I don't know why. I can generate images with no error reported.

qwerkilo commented 1 year ago

[screenshot]

komeiji-renshi commented 1 year ago

100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:06<00:00, 1.34s/it]
Total progress: 10it [05:15, 31.54s/it]
[Tiled Diffusion] upscaling image with R-ESRGAN 4x+...
Tile 1/15 Tile 2/15 Tile 3/15 Tile 4/15 Tile 5/15 Tile 6/15 Tile 7/15 Tile 8/15 Tile 9/15 Tile 10/15 Tile 11/15 Tile 12/15 Tile 13/15 Tile 14/15 Tile 15/15
[Tiled VAE]: input_size: torch.Size([1, 3, 1152, 672]), tile_size: 512, padding: 32
[Tiled VAE]: split to 3x2 = 6 tiles. Optimal tile size 320x384, original tile size 512x512
[Tiled VAE]: Fast mode enabled, estimating group norm parameters on 298 x 512 image
Error completing request
Arguments: ('task(1k2jtpaacc1d7x3)', 0, '((masterpiece,best quality))', 'EasyNegative, extra fingers,fewer fingers,', [], <PIL.Image.Image image mode=RGBA size=448x768 at 0x25222962CB0>, None, None, None, None, None, None, 20, 15, 4, 0, 1, False, False, 1, 1, 7, 1.5, 0.2, 1949458944.0, -1.0, 0, 0, 0, False, 880, 512, 0, 0, 32, 0, '', '', '', [], 0, True, 'MultiDiffusion', False, True, 1024, 1024, 64, 64, 32, 1, 'R-ESRGAN 4x+', 1.5, False, False, 1, False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', False, 1, 0.4, 0.4, 0.2, 0.2, '', '', True, True, True, True, 0, 512, 64, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 1, '<ul>\n<li><code>CFG Scale</code> should be 2 or lower.</li>\n</ul>\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 1, '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', 0, '', 0, '', True, False, False, False, 0, 0, 0, 512, 512, False, False, True, True, True, False, False, 1, False, False, 2.5, 4, 0, False, 0, 1, False, False, 'u2net', False, False, False, False) {}
Traceback (most recent call last):
  File "F:\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "F:\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "F:\stable-diffusion-webui\modules\img2img.py", line 169, in img2img
    processed = process_images(p)
  File "F:\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "F:\stable-diffusion-webui\modules\processing.py", line 579, in process_images_inner
    p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
  File "F:\stable-diffusion-webui\modules\processing.py", line 1013, in init
    self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
  File "F:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "F:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "F:\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "F:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 830, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "F:\stable-diffusion-webui\modules\lowvram.py", line 48, in first_stage_model_encode_wrap
    return first_stage_model_encode(x)
  File "F:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 83, in encode
    h = self.encoder(x)
  File "F:\stable-diffusion-webui\Python3.10\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "F:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 488, in __call__
    return self.vae_tile_forward(x)
  File "F:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 370, in wrapper
    ret = fn(*args, **kwargs)
  File "F:\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "F:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 668, in vae_tile_forward
    if self.estimate_group_norm(downsampled_z, estimate_task_queue, color_fix=self.color_fix):
  File "F:\stable-diffusion-webui\Python3.10\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "F:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 583, in estimate_group_norm
    tile = group_norm_func(tile)
  File "F:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 462, in group_norm_func
    return custom_group_norm(x, 32, mean, var, weight, bias, 1e-6)
  File "F:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py", line 332, in custom_group_norm
    out = F.batch_norm(input_reshaped, mean, var, weight=None, bias=None,
  File "F:\stable-diffusion-webui\Python3.10\lib\site-packages\torch\nn\functional.py", line 2450, in batch_norm
    return torch.batch_norm(
RuntimeError: shape '[1, 32, 1, 1, 1]' is invalid for input of size 0

I'm also an AMD user: RX 6700 XT with 12GB of VRAM. With Move VAE to GPU checked I get exactly the same error as the issue author above; with it unchecked, generating anything a bit larger throws RuntimeError: Could not allocate tensor with 891813888 bytes. There is not enough GPU video memory available!
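Some context on this second error: per the tracebacks, Tiled VAE's fast mode estimates GroupNorm statistics on a downsampled copy of the input and applies them through F.batch_norm on a reshaped tensor, so "invalid for input of size 0" means the tensor being normalized arrived empty on this backend. A rough sketch of group normalization with externally supplied statistics (an approximation of the idea, not the extension's verbatim custom_group_norm):

import torch
import torch.nn.functional as F

def group_norm_with_stats(x, num_groups, mean, var, weight=None, bias=None, eps=1e-6):
    # x: (B, C, H, W); mean and var hold one statistic per group, shape (B*num_groups,).
    b, c, h, w = x.shape
    # Fold each group into a "channel" so batch_norm normalizes per group;
    # this reshape is what fails when x somehow arrives with zero elements.
    grouped = x.view(1, b * num_groups, -1)
    out = F.batch_norm(grouped, mean, var, weight=None, bias=None,
                       training=False, momentum=0, eps=eps)
    out = out.view(b, c, h, w)
    # Apply the learned per-channel affine parameters, if any.
    if weight is not None:
        out = out * weight.view(1, -1, 1, 1)
    if bias is not None:
        out = out + bias.view(1, -1, 1, 1)
    return out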

pkuliyi2015 commented 1 year ago

I'm still looking into this problem. A friend has recently been helping me test on an AMD card, but I'm not yet sure exactly what is going on.

pkuliyi2015 commented 1 year ago

The missing-tile problem has also been reported by others. The cause is still unclear.

cbisuper commented 1 year ago

The missing-tile problem has also been reported by others. The cause is still unclear.

You've been updating the extension very diligently lately. Thanks for all the work.

sorryhorizonTT commented 1 year ago

Keep up the good work!

sorryhorizonTT commented 1 year ago

Someone in the group chat runs an AMD card under Ubuntu, apparently without problems; everyone on Windows hits this error.

RicarodC commented 1 year ago

RuntimeError: shape '[1, 32, 1, 1, 1]' is invalid for input of size 0. AMD card; it appears whether or not Move VAE to GPU is checked. Is there any hope...?

pkuliyi2015 commented 1 year ago

I've been wanting to fix this for days, but the problem is that I don't have an AMD card, and none of the experts I know use one… Is there anyone with an AMD card who can also debug Python willing to help out?

qwerkilo commented 1 year ago

No rush. AMD's next ROCm release may come to Windows, and it should probably run then; DirectML still has quite a few problems. (You could open a separate DirectML branch, and I'd be happy to help test it.)

wzddl commented 1 year ago

I'm still looking into this problem. A friend has recently been helping me test on an AMD card, but I'm not yet sure exactly what is going on.

I'm also on AMD, a 6650 XT with 8GB. If you need testing, I can help with debugging.

qwerkilo commented 1 year ago

I've been wanting to fix this for days, but the problem is that I don't have an AMD card, and none of the experts I know use one… Is there anyone with an AMD card who can also debug Python willing to help out?

Looks like someone got it working: setting Decoder Tile Size to 128 lets it generate images. I'll try it later. https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111/issues/134#issuecomment-1513365290

Confirmed: it should work. Anyone on Windows + AMD (DirectML) can try this method: https://github.com/lshqqytiger/stable-diffusion-webui-directml/discussions/84#discussioncomment-5642611

I ran it myself as well, and it's indeed fine now.

wzddl commented 1 year ago

I've been wanting to fix this for days, but the problem is that I don't have an AMD card, and none of the experts I know use one… Is there anyone with an AMD card who can also debug Python willing to help out?

Looks like someone got it working: setting Decoder Tile Size to 128 lets it generate images. I'll try it later. https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111/issues/134#issuecomment-1513365290

Confirmed: it should work. Anyone on Windows + AMD (DirectML) can try this method: https://github.com/lshqqytiger/stable-diffusion-webui-directml/discussions/84#discussioncomment-5642611

I ran it myself as well, and it's indeed fine now.

I still have problems; it reports insufficient VRAM. RX 6650 XT, 8GB.

GRFTSOL commented 1 year ago

I've been wanting to fix this for days, but the problem is that I don't have an AMD card, and none of the experts I know use one… Is there anyone with an AMD card who can also debug Python willing to help out?

Looks like someone got it working: setting Decoder Tile Size to 128 lets it generate images. I'll try it later. #134 (comment)

Confirmed: it should work. Anyone on Windows + AMD (DirectML) can try this method: lshqqytiger/stable-diffusion-webui-directml#84 (reply in thread)

I ran it myself as well, and it's indeed fine now.

Still no luck: at 128, 8GB of VRAM isn't enough :( I'm on a 6650 XT too.

Update: with CFG lowered to 4, I can generate at Decoder Tile Size 128 :) But it only succeeded twice and has never worked since. Switching models doesn't help, and neither does lowering CFG further, even past the point where the images stop being usable.

qwerkilo commented 1 year ago

I've been wanting to fix this for days, but the problem is that I don't have an AMD card, and none of the experts I know use one… Is there anyone with an AMD card who can also debug Python willing to help out?

Looks like someone got it working: setting Decoder Tile Size to 128 lets it generate images. I'll try it later. #134 (comment) Confirmed: it should work. Anyone on Windows + AMD (DirectML) can try this method: lshqqytiger/stable-diffusion-webui-directml#84 (reply in thread) I ran it myself as well, and it's indeed fine now.

Still no luck: at 128, 8GB of VRAM isn't enough :( I'm on a 6650 XT too.

Update: with CFG lowered to 4, I can generate at Decoder Tile Size 128 :) But it only succeeded twice and has never worked since. Switching models doesn't help, and neither does lowering CFG further, even past the point where the images stop being usable.

Try these settings; this is what works for me:

- Tiled VAE: Encoder Tile Size 1024, Decoder Tile Size 128, and leave Fast Encoder unchecked.
- Tiled Diffusion: latent tile width and height 128 (I usually set overlap to 64; the default 48 should also be fine, I think I've generated successfully with it too).
- Check Move VAE to GPU and leave everything else at defaults.
- Use batch size 1: raising it doesn't improve speed and easily triggers out-of-VRAM errors.

With these settings I can generate without problems (mine is 768×512 upscaled 2×; with these parameters no color blocks appear, while other settings do produce them).

pkuliyi2015 commented 1 year ago

Someone has now worked out the cause: it is a DirectML problem. With large tensors, DirectML incorrectly fills the bottom-left corner with zeros, so the only realistic fix is to wait for a DirectML update.

They also pointed out that at certain particular sizes the problem does not occur, so you can try a range of encoder/decoder tile sizes.
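A quick way to probe a given setup for the zero-fill behavior described above (hypothetical test code, assuming the torch_directml package that the DirectML fork of the webui uses; whether this exact shape triggers the bug on a particular driver is a guess):

import torch
import torch_directml  # ships with stable-diffusion-webui-directml setups

dev = torch_directml.device()

# Allocate a large tensor of ones on the DirectML device and read it back.
# If the backend mis-fills large allocations, the bottom-left corner
# (last rows, first columns) comes back as zeros instead of ones.
x = torch.ones(1, 4, 2048, 2048, device=dev)
corner = x[:, :, -256:, :256].to("cpu")
print("zero elements in bottom-left corner:", (corner == 0).sum().item())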

GRFTSOL commented 1 year ago

Someone has now worked out the cause: it is a DirectML problem. With large tensors, DirectML incorrectly fills the bottom-left corner with zeros, so the only realistic fix is to wait for a DirectML update.

They also pointed out that at certain particular sizes the problem does not occur, so you can try a range of encoder/decoder tile sizes.

I see, thanks for the explanation. Then I'll just remote into my home machine and use it under Linux instead.

GRFTSOL commented 1 year ago

I've been wanting to fix this for days, but the problem is that I don't have an AMD card, and none of the experts I know use one… Is there anyone with an AMD card who can also debug Python willing to help out?

Looks like someone got it working: setting Decoder Tile Size to 128 lets it generate images. I'll try it later. #134 (comment) Confirmed: it should work. Anyone on Windows + AMD (DirectML) can try this method: lshqqytiger/stable-diffusion-webui-directml#84 (reply in thread) I ran it myself as well, and it's indeed fine now.

Still no luck: at 128, 8GB of VRAM isn't enough :( I'm on a 6650 XT too. Update: with CFG lowered to 4, I can generate at Decoder Tile Size 128 :) But it only succeeded twice and has never worked since. Switching models doesn't help, and neither does lowering CFG further, even past the point where the images stop being usable.

Tiled VAE: Encoder Tile Size 1024, Decoder Tile Size 128, Fast Encoder unchecked; Tiled Diffusion: latent tile width and height 128 (overlap 64; the default 48 should also work), Move VAE to GPU checked, everything else at defaults, batch size 1. With these I can generate without problems (768×512 upscaled 2×; no color blocks with these parameters, while other settings do produce them).

It's probably because I'm on an AMD card: with 8GB of VRAM and exactly the same settings as yours it still fails. Both img2img and txt2img run out of VRAM; only certain specific resolutions complete, and since those runs take so long, the results come out looking like eldritch horrors anyway :( Next time I'm definitely getting a dual-GPU setup.