John-WL / sd-webui-inpaint-difference

A1111 extension to find the inpaint mask to use based on the difference between two images.
MIT License
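Per the repo description above, the extension derives the inpaint mask from the per-pixel difference between two images. A minimal sketch of that general idea, assuming a simple threshold-and-dilate approach (the function name, threshold, and filter size are illustrative assumptions, not the extension's actual code):

```python
import numpy as np
from PIL import Image, ImageFilter

def difference_mask(base: Image.Image, altered: Image.Image, threshold: int = 16) -> Image.Image:
    """Build a black/white inpaint mask that is white wherever the two images differ.

    Assumes both images have the same dimensions.
    """
    a = np.asarray(base.convert("RGB"), dtype=np.int16)
    b = np.asarray(altered.convert("RGB"), dtype=np.int16)
    diff = np.abs(a - b).max(axis=-1)                  # largest per-channel difference at each pixel
    mask = (diff > threshold).astype(np.uint8) * 255   # threshold to a binary mask
    return Image.fromarray(mask, mode="L").filter(ImageFilter.MaxFilter(5))  # dilate slightly
```

Here `base` and `altered` stand in for the two images the user provides to the extension's UI.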

SDNEXT: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [0] #27

Open · RGX650 opened this issue 2 months ago

RGX650 commented 2 months ago

01:28:47-304120 ERROR gradio call: RuntimeError
Traceback (most recent call last):
  C:\AI\SDNEXT\automatic\modules\call_queue.py:31 in f
      res = func(*args, **kwargs)
  in img2img:264
  C:\AI\SDNEXT\automatic\modules\processing.py:193 in process_images
      processed = process_images_inner(p)
  C:\AI\SDNEXT\automatic\extensions-builtin\sd-webui-controlnet\scripts\batch_hijack.py:42
      return getattr(processing, '__controlnet_original_process_ima...
  C:\AI\SDNEXT\automatic\modules\processing.py:264 in process_images_inner
      p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
  ... 9 frames hidden ...
  C:\AI\SDNEXT\automatic\repositories\ldm\modules\diffusionmodules\model.py:523
      hs = [self.conv_in(x)]
  C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511
      return self._call_impl(*args, **kwargs)
  C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520
      return forward_call(*args, **kwargs)
  C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:460 in forward
      return self._conv_forward(input, self.weight, self.bias)
  C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:456 in _conv_forward
      return F.conv2d(input, weight, bias, self.stride, self.padding, self.dilation, self.groups)
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [0]
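For what it's worth, this PyTorch error means an empty tensor (size `[0]`) reached the first convolution of the VAE encoder, i.e. no init image ended up in the batch handed to `conv_in`. A standalone snippet that should reproduce the same message on recent PyTorch versions (purely illustrative, unrelated to the extension's code):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)  # stands in for the VAE encoder's conv_in
x = torch.empty(0)                                 # an empty tensor, like the size [0] input in the traceback
conv(x)  # RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [0]
```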

John-WL commented 2 months ago

I don't see a function hijacked by the extension in the error message. Can you help me understand how this is related to the extension? Am I missing something here?

RGX650 commented 2 months ago
[screenshot attached: Untitled]

The inpaint-difference extension computes the diff between the two provided images correctly, but when I click Generate:

10:52:00-360204 DEBUG Server: alive=True jobs=3 requests=141 uptime=201 memory=4.26/63.2 backend=Backend.ORIGINAL state=idle 10:53:24-702967 ERROR Image processing unknown mode: 6.0 10:53:24-735203 DEBUG Sampler: sampler="DPM++ 2M" config={'scheduler': 'karras', 'brownian_noise': False} 10:53:25-006878 ERROR Exception: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [0] 10:53:25-007878 ERROR Arguments: args=('task(nivwylmp9ee8sxu)', 6.0, '', '', [], None, None, None, None, None, None, None, 20, 11, 4, 1, 1, True, False, False, 1, 1, 6, 6, 0.7, 0, 1, 0, 1, 0.5, -1.0, -1.0, 0, 0, 0, 1, 512, 512, 1, 1, 'None', 0, 32, 0, None, '', '', '', 0, 0, 0, 0, False, 4, 0.95, False, 0.6, 1, '#000000', 0, [], 0, UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', inpaint_crop_input_image=False, hr_option='Both', save_detected_map=True, advanced_weighting=None), False, False, {'ad_model': 'face_yolov8n.pt', 'ad_model_classes': '', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'UniPC', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_model_classes': '', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 
'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'UniPC', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 'DemoFusion', False, 128, 64, 4, 2, False, 10, 1, 1, 64, False, True, 3, 1, 1, True, 0.85, 0.6, 4, False, False, 3072, 192, True, True, True, False, False, False, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, None, 'Refresh models', False, False, 20, 4, 4, 0.4, 0.95, 2, 2, 0.4, 0.5, False, 1, False, False, 'Use same checkpoint', 'Use same vae', 1, 0, 'None', 'None', False, 0.15, 3, 0.4, 4, 'bicubic', 0.5, 2, True, False, True, False, False, False, 'Use same checkpoint', 'Use same vae', 'txt2img-1pass', 'None', '', '', 'Use same sampler', 'Use same scheduler', 'BMAB fast', 20, 7, 0.75, 0.5, 0.1, 0.9, False, False, 'Select Model', 'None', '', '', 'Use same sampler', 'Use same scheduler', 20, 7, 0.75, 4, 0.35, False, 50, 200, 0.5, False, True, 'stretching', 'bottom', 'None', 0.85, 0.75, False, 'Use same checkpoint', 'Use same vae', True, '', '', 'Use same sampler', 'Use same scheduler', 'BMAB fast', 20, 7, 0.75, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, None, False, 1, False, '', False, False, False, True, True, 4, 3, 0.1, 1, 1, 0, 0.4, 7, False, False, False, 'Score', 1, '', '', '', '', '', '', '', '', '', '', False, 512, 512, 7, 20, 4, 'Use same checkpoint', 'Use same vae', 'Use same sampler', 'Use same scheduler', 'Only masked', 32, 'BMAB Face(Normal)', 0.4, 4, 0.35, False, 0.26, False, True, False, 'subframe', '', '', 0.4, 7, True, 2, 0.3, 0.1, 'Whole picture', 32, '', False, False, False, 0.4, 0.1, 0.9, 'Both', False, 0.3, 0, 0.1, False, 'Random', False, 'Inpaint', 0.85, 0.6, 30, False, True, 'None', 1.5, 'None', False, 'AGENCYB.TTF', 'bottom-left', 'left', '0', '#000000', '#000000', 12, 100, 0, 5, '', '', 'None', False, 0, 0, 0.6, 0.9, 0.2, 0.8, True, False, False, False, '#000000', 1, 1, 0.9, 6.0, <PIL.Image.Image image mode=RGB size=768x512 at 0x231A5C7B940>, <PIL.Image.Image image mode=RGB size=768x512 at 0x231A5C79390>, 'NONE:0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\nALL:1,1,1,1,1,1,1,1,1,1 ,1,1,1,1,1,1,1\nINS:1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0\nIND:1,0,0 ,0,1,1,1,0,0,0,0,0,0,0,0,0,0\nINALL:1,1,1,1,1,1,1,0,0,0,0,0,0,0, 0,0,0\nMIDD:1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0\nOUTD:1,0,0,0,0,0, 0,0,1,1,1,1,0,0,0,0,0\nOUTS:1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1\nO UTALL:1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1\nALL0.5:0.5,0.5,0.5,0.5, 0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5', True, 0, 'values', '0,0.25,0.5,0.75,1', 'Block ID', 'IN05-OUT05', 'none', '', '0.5,1', 
'BASE,IN00,IN01,IN02,IN03,IN04,IN05,IN06,IN07,IN08,IN09,IN10,IN1 1,M00,OUT00,OUT01,OUT02,OUT03,OUT04,OUT05,OUT06,OUT07,OUT08,OUT0 9,OUT10,OUT11', 1.0, 'black', '20', False, 'ATTNDEEPON:IN05-OUT05:attn:1\n\nATTNDEEPOFF:IN05-OUT05:attn:0\n \nPROJDEEPOFF:IN05-OUT05:proj:0\n\nXYZ:::1', False, False, False, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, '', 0, '', 0, '', 0, None, False, 0, 0, 'from modules.processing import process_images\n\np.width = 768\np.height = 768\np.batch_size = 2\np.steps = 10\n\nreturn process_images(p)', 2, 4, 0.5, 'Linear', 'None', '&nbsp Outpainting
', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '&nbsp SD Upscale
', 64, 0, 2, 0, '', [], 0, '', [], 0, '', [], False, True, False, False, False, False, 0, None, None, False, None, None, False, None, None, False, 50, 20, 0.55, 'ESRGAN_4x', 'SD15\reliberate_v20 [6b08e2c182]', 'sequence', 'outputs/temporal', 'untitled', 100, 1, False, 0, 1, 1, 2, 0.5, True, False, 1, 1, 0, 0, 0, False, 1, False, 'Euler a', 15, 0.2, 1, 0, 0, 0, True, True, [], False, 1, False, 'normal', 0, None, False, False, 0, False, 1, False, 'normal', 1, 1, 1, None, False, False, 0, False, 1, False, 'normal', None, False, False, None, False, False, 0, False, 1, False, 'normal', '#ffffff', None, False, False, 0, False, 1, False, 'normal', '', None, False, False, 0, False, 1, False, 'normal', None, 0, None, False, False, 0, False, 1, False, 'normal', 0, 50, None, False, False, 0, False, 1, False, 'normal', 'erosion', 0, None, False, False, 0, False, 1, False, 'normal', 0, 0, None, False, False, 0, False, 1, False, 'normal', 1, 1, 2, 0.5, 0, False, None, False, False, 0, False, 1, False, 'normal', None, False, False, None, False, False, 0, False, 1, False, 'normal', 0, 0, None, False, False, 0, False, 1, False, 'normal', False, False, None, False, False, 0, False, 1, False, 'normal', 0, 0, 0, 1, None, False, False, 0, 30, False, [], False, 1, False, 1, 1, 1, False, 60, False, 60, 0, False, 512, 512, False, '#000000', False, 0.5, 0, False, 0, 3, False, 1, 'mean', 0, False, '{frame}', 0, 0, 0, 0, 'sans', 16, '#ffffff', 1, 1.0, 1.0, '#000000', 1, False, False, False, 10, 'NONE:0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\nALL:1,1,1,1,1,1,1,1,1,1 ,1,1,1,1,1,1,1\nINS:1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0\nIND:1,0,0 ,0,1,1,1,0,0,0,0,0,0,0,0,0,0\nINALL:1,1,1,1,1,1,1,0,0,0,0,0,0,0, 0,0,0\nMIDD:1,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0\nOUTD:1,0,0,0,0,0, 0,0,1,1,1,1,0,0,0,0,0\nOUTS:1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1\nO UTALL:1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1\nALL0.5:0.5,0.5,0.5,0.5, 0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5', True, 0, 'values', '0,0.25,0.5,0.75,1', 'Block ID', 'IN05-OUT05', 'none', '', '0.5,1', 'BASE,IN00,IN01,IN02,IN03,IN04,IN05,IN06,IN07,IN08,IN09,IN10,IN1 1,M00,OUT00,OUT01,OUT02,OUT03,OUT04,OUT05,OUT06,OUT07,OUT08,OUT0 9,OUT10,OUT11', 1.0, 'black', '20', False, 'ATTNDEEPON:IN05-OUT05:attn:1\n\nATTNDEEPOFF:IN05-OUT05:attn:0\n \nPROJDEEPOFF:IN05-OUT05:proj:0\n\nXYZ:::1', False, False) kwargs={} 10:53:25-031369 ERROR gradio call: RuntimeError ╭────────────────────────── Traceback (most recent call last) ──────────────────────────╮ │ C:\AI\SDNEXT\automatic\modules\call_queue.py:31 in f │ │ │ │ 30 │ │ │ try: │ │ ❱ 31 │ │ │ │ res = func(*args, kwargs) │ │ 32 │ │ │ │ progress.record_results(id_task, res) │ │ in img2img:264 │ │ │ │ C:\AI\SDNEXT\automatic\modules\processing.py:193 in process_images │ │ │ │ 192 │ │ │ with context_hypertile_vae(p), context_hypertile_unet(p): │ │ ❱ 193 │ │ │ │ processed = process_images_inner(p) │ │ 194 │ │ │ │ C:\AI\SDNEXT\automatic\extensions-builtin\sd-webui-controlnet\scripts\batch_hijack.py │ │ │ │ 41 │ │ │ # we are not in batch mode, fallback to original function │ │ ❱ 42 │ │ │ return getattr(processing, '__controlnet_original_process_images_in │ │ 43 │ │ │ │ C:\AI\SDNEXT\automatic\modules\processing.py:264 in process_images_inner │ │ │ │ 263 │ │ │ with devices.autocast(): │ │ ❱ 264 │ │ │ │ p.init(p.all_prompts, p.all_seeds, p.all_subseeds) │ │ 265 │ │ extra_network_data = None │ │ │ │ ... 9 frames hidden ... 
│ │ │ │ C:\AI\SDNEXT\automatic\repositories\ldm\modules\diffusionmodules\model.py:523 in forw │ │ │ │ 522 │ │ # downsampling │ │ ❱ 523 │ │ hs = [self.conv_in(x)] │ │ 524 │ │ for i_level in range(self.num_resolutions): │ │ │ │ C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1511 in _wra │ │ │ │ 1510 │ │ else: │ │ ❱ 1511 │ │ │ return self._call_impl(*args, *kwargs) │ │ 1512 │ │ │ │ C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\module.py:1520 in _cal │ │ │ │ 1519 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │ │ ❱ 1520 │ │ │ return forward_call(args, kwargs) │ │ 1521 │ │ │ │ C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:460 in forward │ │ │ │ 459 │ def forward(self, input: Tensor) -> Tensor: │ │ ❱ 460 │ │ return self._conv_forward(input, self.weight, self.bias) │ │ 461 │ │ │ │ C:\AI\SDNEXT\automatic\venv\lib\site-packages\torch\nn\modules\conv.py:456 in _conv_f │ │ │ │ 455 │ │ │ │ │ │ │ _pair(0), self.dilation, self.groups) │ │ ❱ 456 │ │ return F.conv2d(input, weight, bias, self.stride, │ │ 457 │ │ │ │ │ │ self.padding, self.dilation, self.groups) │ ╰─────────────────────────────────────────────────────────────────────────╯ RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [0]
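One possibly relevant detail in the log above: `ERROR Image processing unknown mode: 6.0` is emitted right before the failure, and `6.0` is also the second positional value in the dumped args, which looks like the img2img mode selector. If the img2img dispatcher does not recognize that mode value, no init image gets selected, and an empty batch would then reach the VAE encoder, which would produce exactly the conv2d size `[0]` error. This is only a hypothesis; a schematic sketch of that failure pattern (hypothetical names, not SD.Next's actual code):

```python
# Hypothetical illustration of the suspected failure mode (not SD.Next's real img2img code).
def select_init_images(mode: int, images: dict) -> list:
    if mode == 0:                      # plain img2img
        return [images["img2img"]]
    if mode == 1:                      # sketch
        return [images["sketch"]]
    # ... further known modes handled similarly ...
    print(f"Image processing unknown mode: {mode}")
    return []                          # nothing selected -> empty batch -> conv2d gets a size [0] tensor
```

A mode value the dispatcher does not know about (such as `6.0`) would fall through to the last branch.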

RGX650 commented 2 months ago

[screenshot attached: image_2024-04-26_113025308]

Normal img2img is working fine, though...

John-WL commented 2 months ago

Okay, I'll try to reproduce this on my side. Thank you for the additional details.

John-WL commented 2 months ago

Hmm, I'm not getting the same error. In fact, I don't get any errors at all in the console, which is very weird: I click Generate and nothing happens, with no message in the console whatsoever. It does work when the extension is not installed, so it's most likely the extension, but zero errors in the console makes it hard to debug!

RGX650 commented 2 months ago

I will reinstall my venv, install only your extension, and get back to you.

John-WL commented 2 months ago

FYI, I had to reinstall SDNext entirely from a fresh clone yesterday; I hadn't tried SDNext in months and it was broken. So my venv was reinstalled as well when I tried to reproduce the error.

John-WL commented 2 months ago

Small update: I haven't given up on this issue yet, but I'm very busy at the moment. I'll dig into it more in a few days.

John-WL commented 1 month ago

I've spent a few more hours on it now, but I think I'm giving up on making it work with the current patch. I suspect it has something to do with the recompilation patch that sdwi2iextender performs. I'll get back into it and make sure it works for SDNext once sdwi2iextender switches fully to the new patch mechanism for Webui 1.9. I'm currently waiting for Forge to switch to 1.9 before implementing that in sdwi2iextender; their implementation is still on 1.8.

John-WL commented 2 days ago

Small update on this issue: as of June 2024, Forge is no longer an official repo that competes directly with A1111.

Webui 1.9 is still extremely slow compared to the Webui dev branch or Forge, so the last working version of Forge will keep being supported in sdwi2iextender until Webui 1.10, with all of its optimizations, is released.

This means that this issue definitely won't be fixed before Webui 1.10 is released (possibly still a few months away).