pkuliyi2015 / sd-webui-stablesr

StableSR for Stable Diffusion WebUI - Ultra High-quality Image Upscaler
https://iceclear.github.io/projects/stablesr/

Runtime Error: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 2 #26

Closed — James0MEGA closed this issue 1 year ago

James0MEGA commented 1 year ago

```
[StableSR] Target image size: 1024x2048
[Tiled Diffusion] ControlNet found, support is enabled.
MixtureOfDiffusers Sampling: : 0it [00:00, ?it/s]
Mixture of Diffusers hooked into 'Euler a' sampler, Tile size: 64x64, Tile batches: 21, Batch size: 1. (ext: ContrlNet)
[Tiled VAE]: input_size: torch.Size([1, 3, 2048, 1024]), tile_size: 1024, padding: 32
[Tiled VAE]: split to 2x1 = 2 tiles. Optimal tile size 960x992, original tile size 1024x1024
[Tiled VAE]: Fast mode enabled, estimating group norm parameters on 512 x 1024 image
[Tiled VAE]: Executing Encoder Task Queue: 100%|█████████████████████████████████████| 182/182 [00:02<00:00, 69.48it/s]
[Tiled VAE]: Done in 18.380s, max VRAM alloc 3158.800 MB
  0%|          | 0/20 [00:01<?, ?it/s]
Error completing request
Arguments: ('task(76hs6hmz13knsse)', 0, '', '', [], <PIL.Image.Image image mode=RGBA size=512x1024 at 0x23DEA0BD690>, None, None, None, None, None, None, 20, 0, 4, 0, 0, False, False, 1, 1, 2, 1.5, 0.75, -1.0, -1.0, 0, 0, 0, False, 0, 1024, 512, 1, 0, 0, 32, 0, '', '', '', [], 15,
'\n\nEstimated VRAM usage: 6675.32 MB / 8192 MB (81.49%)\n(4891 MB system + 1622.11 MB used)\n\n ', False,
{'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_conf': 30, 'ad_dilate_erode': 32, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_full_res': True, 'ad_inpaint_full_res_padding': 0, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_controlnet_model': 'None', 'ad_controlnet_weight': 1},
{'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_conf': 30, 'ad_dilate_erode': 32, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_full_res': True, 'ad_inpaint_full_res_padding': 0, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_controlnet_model': 'None', 'ad_controlnet_weight': 1},
False, 'keyword prompt', 'keyword1, keyword2', 'None', 'textual inversion first', 'None', '0.7', 'None', True, 'Mixture of Diffusers', False, True, 1024, 1024, 64, 64, 32, 1, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, True, 1024, 128, True, True, True, False, False, '', 0, False, 7, 100, 'Constant', 0, 'Constant', 0, 4,
<controlnet.py.UiControlNetUnit object at 0x0000023DEA0BDF90>, <controlnet.py.UiControlNetUnit object at 0x0000023DEA0BC9D0>, <controlnet.py.UiControlNetUnit object at 0x0000023DEA0BC6A0>, <controlnet.py.UiControlNetUnit object at 0x0000023DEA5372E0>,
False, 1, 0.15, False, 'OUT', ['OUT'], 5, 0, 'Bilinear', False, 'Pooling Max', False, 'Lerp', '', '', False, False, False, 'Horizontal', '1,1', '0.2', False, False, False, 'Attention', False, '0', '0', '0.4', None, False, '1:1,1:2,1:2', '0:0,0:0,0:1', '0.2,0.8,0.8', 150, 0.2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, '', [], '', True, False, False, '\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 0.5, 'Linear', 'None',
'\nRecommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8\n', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, None, None, '', '', '', '', 'Auto rename', {'label': 'Upload avatars config'}, 'Open outputs directory', 'Export to WebUI style', True, {'label': 'Presets'}, {'label': 'QC preview'}, '', [], 'Select', 'QC scan', 'Show pics', None, False, False, 'positive', 'comma', 0, False, False, '',
'\nWill upscale the image by the selected scale factor; use width and height sliders to set tile size\n', 64, 0, 2, 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, 'Blur First V1', 0.25, 10, 10, 10, 10, 1, False, '', '', 0.5, 1, False, None, False, None, False, None, False, None, False, 50, 'stablesr_webui_sd-v2-1-512-ema-000117.ckpt', 2, True, 'Wavelet', False,
'\nWill upscale the image depending on the selected target size type\n', 512, 8, 32, 64, 0.35, 32, 0, True, 0, False, 8, 0, 0, 2048, 2048, 2) {}
Traceback (most recent call last):
  File "D:\SD\stable-diffusion-webui\modules\call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "D:\SD\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "D:\SD\stable-diffusion-webui\modules\img2img.py", line 180, in img2img
    processed = modules.scripts.scripts_img2img.run(p, *args)
  File "D:\SD\stable-diffusion-webui\modules\scripts.py", line 408, in run
    processed = script.run(p, *script_args)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\scripts\stablesr.py", line 248, in run
    result: Processed = processing.process_images(p)
  File "D:\SD\stable-diffusion-webui\modules\processing.py", line 526, in process_images
    res = process_images_inner(p)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
    return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
  File "D:\SD\stable-diffusion-webui\modules\processing.py", line 680, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\scripts\stablesr.py", line 223, in sample_custom
    samples = sampler.sample(p, x, conditioning, unconditional_conditioning, image_conditioning=p.image_conditioning)
  File "D:\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 377, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "D:\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 251, in launch_sampling
    return func()
  File "D:\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 377, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 135, in forward
    x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict([cond_in], image_cond_in))
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\SD\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\tile_utils\utils.py", line 243, in wrapper
    return fn(*args, **kwargs)
  File "D:\SD\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\tile_methods\mixtureofdiffusers.py", line 126, in apply_model_hijack
    x_tile_out = shared.sd_model.apply_model_original_md(x_tile, t_tile, c_tile)
  File "D:\SD\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "D:\SD\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\scripts\stablesr.py", line 97, in unet_forward
    return getattr(unet, FORWARD_CACHE_NAME)(x, timesteps, context, y, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 797, in forward
    h = module(h, emb, context)
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 82, in forward
    x = layer(x, emb)
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 249, in forward
    return checkpoint(
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\util.py", line 121, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\util.py", line 136, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\srmodule\spade.py", line 141, in <lambda>
    resblock._forward = lambda x, timesteps, resblock=resblock, spade=self.input_blocks[i]: dual_resblock_forward(resblock, x, timesteps, spade, get_struct_cond)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\srmodule\spade.py", line 80, in dual_resblock_forward
    h = spade(h, get_struct_cond())
  File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\srmodule\spade.py", line 34, in forward
    return checkpoint(
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\util.py", line 121, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\util.py", line 136, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "D:\SD\stable-diffusion-webui\extensions\sd-webui-stablesr\srmodule\spade.py", line 52, in _forward
    out *= (1 + self.mlp_gamma(actv).repeat_interleave(repeat_factor, dim=0))
RuntimeError: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 2
```
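For anyone reading the trace: the failing line in `spade.py` multiplies the UNet's activations for one 64×64 latent tile by StableSR's structure-conditioning map, and the two tensors disagree on dimension 2 (64 vs. 128). The sketch below reproduces the same kind of elementwise shape mismatch in NumPy (which raises `ValueError` where PyTorch raises the `RuntimeError` above); the exact shapes are my reading of the log, not confirmed by the extension authors.

```python
import numpy as np

# Minimal sketch of the failing multiply: a tensor shaped for one 64x64
# latent tile meets a conditioning map shaped for the full-image latent.
# Shapes are illustrative guesses from the log, not taken from the extension.
h = np.ones((1, 4, 64, 64))              # one Tiled Diffusion latent tile (NCHW)
struct_cond = np.ones((1, 4, 128, 256))  # full-image latent: 1024/8 x 2048/8

try:
    out = h * (1 + struct_cond)          # analogue of `out *= (1 + gamma)`
except ValueError as err:
    print("shape mismatch:", err)
```

Updating Tiled Diffusion (as suggested below) makes the tile machinery and StableSR agree on which resolution the conditioning map should have, which is consistent with the fix reported later in this thread.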

Hello,

I had a similar issue back when I first started using Stable Diffusion. IIRC it was an img2img dimension mismatch that I fixed by matching the width/height sliders to the dimensions of the input image.

I have no idea what the issue could be here, though. This was my first run with StableSR, and I matched all the relevant settings to those provided in the guide:

- Sampler: Euler a, 20 steps
- W 512 × H 1024 (matches input image), CFG Scale 2
- Latent tile dimensions: 64 × 64, Overlap 32, Batch Size 1
- Scale Factor 2, No Upscaler
- Tiled VAE: Encoder Tile Size 1024, Decoder Tile Size 128
- StableSR Scale Factor 2
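For reference, the 64-vs-128 in the error lines up with these settings once you account for Stable Diffusion's 8× VAE downscale; the arithmetic below is my own reading of the log, not an official diagnosis.

```python
# Latent-space sizes implied by the settings above (the SD VAE downscales by 8).
# Mapping these numbers onto the failing tensors is my own interpretation.
VAE_DOWNSCALE = 8

input_w, input_h = 512, 1024
scale = 2                                             # StableSR scale factor
target_w, target_h = input_w * scale, input_h * scale # 1024 x 2048, as logged

latent_w = target_w // VAE_DOWNSCALE                  # 128
latent_h = target_h // VAE_DOWNSCALE                  # 256
tile = 64                                             # latent tile width/height

print(latent_w, latent_h, tile)                       # 128 256 64
```

So a 64-wide latent tile meeting a 128-wide map is exactly the pair of sizes the `RuntimeError` complains about.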

James0MEGA commented 1 year ago

Addendum: I've tried several other upscaling extensions/methods and they work as expected; the error only occurs when using StableSR.

pkuliyi2015 commented 1 year ago

Two things to be aware of:

1. Please update Tiled Diffusion to the latest version.
2. Please do not use the "Latent Upscale" option in the WebUI.

If the problem still exists, please tell me whether the log changes or not. I'm also confused about this.

James0MEGA commented 1 year ago

I'll give it a shot, thank you for the timely response!

James0MEGA commented 1 year ago

It worked!

I'm guessing it was the update, as last time the UI got wonky on me and I had to close my python instance mid-update.

Appreciate the help, bruh