lllyasviel / stable-diffusion-webui-forge


Depth-anything controlnet model not working. #350

Closed 311-code closed 8 months ago

311-code commented 8 months ago

What would your feature do?

Since this came out, I have been trying to use TikTok's depth_anything_vitl14 model that is in the ControlNet model list for depth. This model is supposed to be better than almost everything else out there (besides maybe Marigold), but it never worked for me and always produced an error.

I think I finally figured it out: I was overlooking selecting depth_anything as the preprocessor (and you must be connected to the internet for the initial download). I'm sure a bunch of other people are experiencing this too, selecting depth_midas as the preprocessor like they always did before and then trying the Depth Anything entry in the model list, only to get an error. I am also unsure of the correct resolution settings to use, and feel those should be filled in automatically with a good baseline.
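
For anyone driving this through the API instead of the UI, the same pairing applies: the preprocessor goes in module and the checkpoint in model, and the two must match. A minimal sketch, assuming the standard sd-webui-controlnet txt2img endpoint; the URL, input file, and model name below are placeholders for whatever your own install actually lists:

```python
import base64
import requests

# Minimal sketch, assuming the standard sd-webui-controlnet txt2img API.
# "input.png", the server URL, and the model name are placeholders; check
# the exact model string (including its hash) in your own ControlNet dropdown.
with open("input.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "prompt": "three people in a rap music video",
    "steps": 20,
    "width": 1024,
    "height": 1024,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "enabled": True,
                "module": "depth_anything",        # preprocessor, as in the UI dropdown
                "model": "depth_anything_vitl14",  # must be a depth model compatible with your checkpoint
                "processor_res": 512,
                "image": image_b64,
            }]
        }
    },
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
```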

Proposed workflow

I would recommend that when you select certain models, such as the Depth Anything large model, the preprocessor default to the correct one to avoid confusion, and that sensible base resolution settings be selected as well, roughly along the lines of the sketch below.
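
A sketch of how such defaults could work; all names and values here are illustrative, not Forge's actual internals:

```python
# Hypothetical sketch of the proposed behavior (names are illustrative):
# when the user picks a ControlNet model, suggest a matching preprocessor
# and a reasonable preprocessor resolution based on the model's filename.
PREPROCESSOR_DEFAULTS = {
    "depth_anything": ("depth_anything", 512),
    "midas": ("depth_midas", 512),
    "marigold": ("depth_marigold", 768),
}

def suggest_defaults(model_name: str):
    """Return (preprocessor, resolution) for a model, or None to leave the UI unchanged."""
    lowered = model_name.lower()
    for keyword, defaults in PREPROCESSOR_DEFAULTS.items():
        if keyword in lowered:
            return defaults
    return None

# e.g. suggest_defaults("depth_anything_vitl14.safetensors") -> ("depth_anything", 512)
```

The UI would apply these only as initial values, so a user could still override both fields.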

Additional information

No response

311-code commented 8 months ago

Actually, it appears it's still not working even after it finished the download. I get this error when trying to use depth_anything:

2024-02-21 00:24:41,469 - ControlNet - INFO - ControlNet Input Mode: InputMode.SIMPLE
2024-02-21 00:24:41,471 - ControlNet - INFO - Using preprocessor: depth_anything
2024-02-21 00:24:41,471 - ControlNet - INFO - preprocessor resolution = 512
Automatic Memory Management: 1 Modules in 0.21 seconds.
2024-02-21 00:24:42,932 - ControlNet - INFO - Current ControlNet ControlNetPatcher: D:\webui_forge_cu121_torch21\webui\models\ControlNet\depth-anyone.safetensors
2024-02-21 00:24:42,936 - ControlNet - INFO - ControlNet Method depth_anything patched.
To load target model SDXL
To load target model ControlNet
Begin to load 2 models
unload clone 1
unload clone 0
Moving model(s) has taken 2.15 seconds
  0%|                                                                                           | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\webui_forge_cu121_torch21\webui\modules_forge\main_thread.py", line 37, in loop
    task.work()
  File "D:\webui_forge_cu121_torch21\webui\modules_forge\main_thread.py", line 26, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "D:\webui_forge_cu121_torch21\webui\modules\txt2img.py", line 111, in txt2img_function
    processed = processing.process_images(p)
  File "D:\webui_forge_cu121_torch21\webui\modules\processing.py", line 750, in process_images
    res = process_images_inner(p)
  File "D:\webui_forge_cu121_torch21\webui\modules\processing.py", line 921, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "D:\webui_forge_cu121_torch21\webui\modules\processing.py", line 1276, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "D:\webui_forge_cu121_torch21\webui\modules\sd_samplers_kdiffusion.py", line 251, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\webui_forge_cu121_torch21\webui\modules\sd_samplers_common.py", line 263, in launch_sampling
    return func()
  File "D:\webui_forge_cu121_torch21\webui\modules\sd_samplers_kdiffusion.py", line 251, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\modules\sd_samplers_cfg_denoiser.py", line 182, in forward
    denoised = forge_sampler.forge_sample(self, denoiser_params=denoiser_params,
  File "D:\webui_forge_cu121_torch21\webui\modules_forge\forge_sampler.py", line 82, in forge_sample
    denoised = sampling_function(model, x, timestep, uncond, cond, cond_scale, model_options, seed)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\modules\samplers.py", line 289, in sampling_function
    cond_pred, uncond_pred = calc_cond_uncond_batch(model, cond, uncond_, x, timestep, model_options)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\modules\samplers.py", line 252, in calc_cond_uncond_batch
    c['control'] = control.get_control(input_x, timestep_, control_cond, len(cond_or_uncond))
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\modules\controlnet.py", line 271, in get_control
    control = self.control_model(x=x_noisy.to(dtype), hint=self.cond_hint.to(self.device), timesteps=timestep.float(), context=context.to(dtype), y=y)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\controlnet\cldm.py", line 305, in forward
    h = module(h, emb, context)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\diffusionmodules\openaimodel.py", line 74, in forward
    return forward_timestep_embed(self, *args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\diffusionmodules\openaimodel.py", line 55, in forward_timestep_embed
    x = layer(x, context, transformer_options)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 620, in forward
    x = block(x, context=context[i], transformer_options=transformer_options)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 447, in forward
    return checkpoint(self._forward, (x, context, transformer_options), self.parameters(), self.checkpoint)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\diffusionmodules\util.py", line 194, in checkpoint
    return func(*inputs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 547, in _forward
    n = self.attn2(n, context=context_attn2, value=value_attn2)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\ldm\modules\attention.py", line 391, in forward
    k = self.to_k(context)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\webui\ldm_patched\modules\ops.py", line 50, in forward
    return super().forward(*args, **kwargs)
  File "D:\webui_forge_cu121_torch21\system\python\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)
mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)
*** Error completing request
*** Arguments: ('task(5essltho0pznm2q)', <gradio.routes.Request object at 0x0000014F6DCEAA70>, 'three people in a rap music video', '', [], 20, 'DPM++ 2M Karras', 1, 1, 7, 1024, 1024, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], 0, False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'DPM++ 2M Karras', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, <scripts.animatediff_ui.AnimateDiffProcess object at 0x0000014F6DC91F00>, False, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, False, -1, -1, 0, '1,1', 'Horizontal', '', 2, 1, ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=[], batch_mask_gallery=[], generated_image=None, mask_image=None, hr_option='Both', enabled=True, module='depth_anything', model='depth-anyone [48a4bc3a]', weight=2, image={'image': array([... RGB pixel data elided ...], dtype=uint8), 'mask': array([... all-zero mask elided ...], dtype=uint8)}, resize_mode='Crop and Resize', processor_res=512, threshold_a=0.5, threshold_b=0.5, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=[], batch_mask_gallery=[], generated_image=None, mask_image=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=[], batch_mask_gallery=[], generated_image=None, mask_image=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), False, 7, 1, 'Constant', 0, 'Constant', 0, 1, 'enable', 'MEAN', 'AD', 1, False, 1.01, 1.02, 0.99, 0.95, False, 0.5, 2, False, 256, 2, 0, False, False, 3, 2, 0, 0.35, True, 'bicubic', 'bicubic', False, 0, 'anisotropic', 0, 'reinhard', 100, 0, 'subtract', 0, 0, 'gaussian', 'add', 0, 100, 127, 0, 'hard_clamp', 5, 0, 'None', 'None', False, 'MultiDiffusion', 768, 768, 64, 4, False, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "D:\webui_forge_cu121_torch21\webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
    TypeError: 'NoneType' object is not iterable
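
For what it's worth, the shape pair in that RuntimeError looks like an SD1.5 ControlNet being run against an SDXL checkpoint: SDXL feeds 2048-dimensional text context into cross-attention, while an SD1.5 ControlNet's to_k projection expects 768-dimensional input (and outputs 320). A minimal sketch that reproduces the same error, with the shapes taken straight from the traceback (the 154 would be cond plus uncond batches times 77 tokens):

```python
import torch
import torch.nn.functional as F

# Repro sketch of the suspected mismatch (shapes copied from the traceback,
# the interpretation is an assumption): an SDXL-sized context tensor hitting
# an SD1.5-sized to_k weight inside the ControlNet's cross-attention.
sdxl_context = torch.randn(154, 2048)     # 2 x 77 tokens, 2048 features (SDXL)
sd15_to_k_weight = torch.randn(320, 768)  # out_features=320, in_features=768 (SD1.5)

# F.linear computes input @ weight.T, so this raises:
# RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)
F.linear(sdxl_context, sd15_to_k_weight)
```

If that reading is right, the fix is to pair the depth_anything preprocessor with an SDXL-compatible depth ControlNet rather than the SD1.5 one.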

---
311-code commented 8 months ago

I just realized I was supposed to select a preprocessor like marigold or depth_anything and use the diffusers_xl_depth_full model. I'm not sure how those other models got there; I must have put them there!