comfyanonymous / ComfyUI

The most powerful and modular stable diffusion GUI, API, and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

Help getting a 1080 Ti to run ComfyUI --> got query.dtype: struct c10::Half key.dtype: float and value.dtype: float #3593

Open cnhh-cyc opened 1 month ago

cnhh-cyc commented 1 month ago

When running, the error "got query.dtype: struct c10::Half key.dtype: float and value.dtype: float" is reported; it does not occur on a 4090 GPU. After searching online, I learned that the 1080 Ti does not support float16 calculations. Judging from the results, the calculation does complete. I tried to change attention.py and CrossAttentionPatch.py following the AI prompts, but more errors occurred. Is there a step where query.dtype could be converted to match key.dtype?

Google's AI also suggested the code below, but it does not work.

Thanks, waiting for your help! Many thanks.

import torch

def get_dtype(device):
    """Automatically pick a dtype based on the GPU."""
    if torch.cuda.is_available():
        properties = torch.cuda.get_device_properties(device)
        # Decide from the GPU architecture whether float16 is supported
        if properties.major >= 8:  # Ampere and newer support float16
            return torch.float16
        else:
            return torch.float32
    else:
        return torch.float32  # CPU defaults to float32

# Example
query = torch.randn(1, 1, 1, 1)
key = torch.randn(1, 1, 1, 1)
value = torch.randn(1, 1, 1, 1)

dtype = get_dtype(0)  # get the dtype for GPU 0
query = query.to(dtype)
key = key.to(dtype)
value = value.to(dtype)

# ... continue with other operations
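
For reference, this helper would pick torch.float32 on a 1080 Ti, since Pascal cards report compute capability 6.1. A quick check (a sketch, assuming a CUDA build of PyTorch):

props = torch.cuda.get_device_properties(0)
print(props.name, props.major, props.minor)  # e.g. "GeForce GTX 1080 Ti" 6 1
print(get_dtype(0))                          # torch.float32 on Pascal (major == 6)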

shawnington commented 1 month ago

c10 is a PyTorch error originating from the C++ code. AI is not going to fix the code for you, unfortunately, and we can't help if you don't post the full error. Also, please keep it to English so more people can understand what is going on.

cnhh-cyc commented 1 month ago

Thanks, here is the log:

Error occurred when executing KSampler:

Expected query, key, and value to have the same dtype, but got query.dtype: struct c10::Half key.dtype: float and value.dtype: float instead.

File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list results.append(getattr(obj, func)(slice_dict(input_data_all, i))) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\nodes.py", line 1344, in sample return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\nodes.py", line 1314, in common_ksampler samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\control_reference.py", line 47, in refcn_sample return orig_comfy_sample(model, *args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\utils.py", line 111, in uncond_multiplier_check_cn_sample return orig_comfy_sample(model, args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\sample.py", line 37, in sample samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 761, in sample return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 663, in sample return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 650, in sample output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File 
"F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 629, in inner_sample samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 534, in sample samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, self.extra_options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context return func(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 542, in sample_dpmpp_sde denoised = model(x, sigmas[i] s_in, extra_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 272, in call out = self.inner_model(x, sigma, model_options=model_options, seed=seed) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 616, in call return self.predict_noise(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 619, in predict_noise return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 258, in sampling_function out = calc_cond_batch(model, conds, x, timestep, model_options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 218, in calc_cond_batch output = model.apply_model(inputx, timestep, c).chunk(batch_chunks) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\utils.py", line 63, in apply_model_uncond_cleanup_wrapper return orig_apply_model(self, args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 97, in apply_model model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, extra_conds).float() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl return self._call_impl(*args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl return forward_call(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 852, in forward h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, 
num_video_frames=num_video_frames, image_only_indicator=image_only_indicator) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 44, in forward_timestep_embed x = layer(x, context, transformer_options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl return self._call_impl(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl return forward_call(*args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\attention.py", line 644, in forward x = block(x, context=context[i], transformer_options=transformer_options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl return self._call_impl(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl return forward_call(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\attention.py", line 568, in forward n = attn2_replace_patch[block_attn2](n, context_attn2, value_attn2, extra_options) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\CrossAttentionPatch.py", line 26, in call out = out + callback(out, q, k, v, extra_options, **self.kwargs[i]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\CrossAttentionPatch.py", line 150, in ipadapter_attention out_ip = optimized_attention(q, ip_k, ip_v, extra_options["n_heads"]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\attention.py", line 357, in attention_pytorch out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)

shawnington commented 1 month ago

So c10::Half is a 16-bit dtype vs. 32 bits for float: q is torch.float16, while k and v are torch.float32.
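
Here is a minimal sketch (made-up shapes) showing that torch.nn.functional.scaled_dot_product_attention refuses mixed dtypes with exactly this message:

import torch

q = torch.randn(1, 8, 4, 64, dtype=torch.float16)  # half precision, i.e. c10::Half
k = torch.randn(1, 8, 4, 64, dtype=torch.float32)
v = torch.randn(1, 8, 4, 64, dtype=torch.float32)

try:
    torch.nn.functional.scaled_dot_product_attention(q, k, v)
except RuntimeError as e:
    print(e)  # "Expected query, key, and value to have the same dtype..."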

It looks like the issue comes from the ComfyUI-Advanced-ControlNet call to attention_pytorch, which is called in the forward method of class CrossAttentionMM(nn.Module) as self.actual_attention:

def forward(self, x, context=None, value=None, mask=None, scale_mask=None):
        q = self.to_q(x)
        context = default(context, x)
        k: Tensor = self.to_k(context)
        if value is not None:
            v = self.to_v(value)
            del value
        else:
            v = self.to_v(context)

        # apply custom scale by multiplying k by scale factor
        if self.scale is not None:
            k *= self.scale

        # apply scale mask, if present
        if scale_mask is not None:
            k *= scale_mask

        try:
            out = self.actual_attention(q, k, v, self.heads, mask)
        except RuntimeError as e:
            if str(e).startswith("CUDA error: invalid configuration argument"):
                self.actual_attention = fallback_attention_mm
                out = self.actual_attention(q, k, v, self.heads, mask)
            else:
                raise
        return self.to_out(out)

I'm not sure how the context works for this function, but I assume either there is a bug that can cause q, k, v to be cast differently, or there is an assumption that attention_pytorch does forced casting. attention_pytorch does not currently force a cast, even though it has an unused attn_precision parameter, while attention_basic does force a cast (a paraphrased sketch of that pattern follows).
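
Roughly, the attention_basic upcast looks like this (a paraphrased sketch from memory, not the exact ComfyUI source):

import torch

def upcast_qk_sim(q, k, scale):
    # compute the q @ k^T similarity in float32 regardless of input dtype
    with torch.autocast(enabled=False, device_type="cuda"):
        q, k = q.float(), k.float()
        return torch.einsum("b i d, b j d -> b i j", q, k) * scale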

This causes the error in attention_pytorch in ComfyUI\comfy\ldm\modules\attention.py, because torch.nn.functional.scaled_dot_product_attention expects q, k, and v to have the same dtype.

def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2),
        (q, k, v),
    )

    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out
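
For what it's worth, here is one way q and k can end up with different dtypes before they ever reach this function: the UNet's projections run in float16 while a patched-in projection was built in float32 because the card lacks fast fp16 support. The names here are illustrative, not the actual node code (and fp16 matmul on CPU needs a recent PyTorch; on GPU it always works):

import torch

x = torch.randn(1, 77, 640)
to_q = torch.nn.Linear(640, 640).half()   # UNet projection running in float16
ip_to_k = torch.nn.Linear(640, 640)       # patch projection left in float32

q = to_q(x.half())       # torch.float16
k = ip_to_k(x)           # torch.float32
print(q.dtype, k.dtype)  # mismatched dtypes that later trip scaled_dot_product_attention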

attn_precision is not passed into the function, and it's unclear whether it would be expected as a dtype or as a string, since nothing is done with it in the function. However, _ATTN_PRECISION is "fp32" unless it is set to "fp16" by the --dont-upcast-attention command-line flag in attention.py, so I am just going to use that to tell whether to cast to torch.float32 or torch.float16.

These changes should work. Casting is added to the lambda function.

def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    cast_to_type = torch.float32 if _ATTN_PRECISION == "fp32" else torch.float16 
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=cast_to_type),
        (q, k, v),
    )

    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out

Could you change your attention_pytorch function in ComfyUI\comfy\ldm\modules\attention.py to match, and let me know if it fixes the problem?

If it does I'll open a pull-request with the changes.

cnhh-cyc commented 1 month ago

Thanks for your help!

In the line cast_to_type = torch.float32 if _ATTN_PRECISION == "fp32" else torch.float16, is _ATTN_PRECISION a global variable? I haven't seen its definition anywhere in attention.py.

Also, using your code exactly as written, an error is now reported at the IPAdapter Style&Composition SDXL node (the previous error occurred later, at the KSampler step). ====== error log ======

Error occurred when executing IPAdapterStyleComposition:

name '_ATTN_PRECISION' is not defined

File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list results.append(getattr(obj, func)(slice_dict(input_data_all, i))) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\IPAdapterPlus.py", line 758, in apply_ipadapter work_model, face_image = ipadapter_execute(work_model, ipadapter_model, clip_vision, ipa_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\IPAdapterPlus.py", line 309, in ipadapter_execute img_cond_embeds = encode_image_masked(clipvision, image, batch_size=encode_batch_size) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\utils.py", line 177, in encode_image_masked out = clip_vision.model(pixel_values=pixel_values, intermediate_output=-2) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl return self._call_impl(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

shawnington commented 1 month ago

> In the line cast_to_type = torch.float32 if _ATTN_PRECISION == "fp32" else torch.float16, is _ATTN_PRECISION a global variable? I haven't seen its definition anywhere in attention.py.
>
> Error occurred when executing IPAdapterStyleComposition: name '_ATTN_PRECISION' is not defined

How embarrassing, I was working from an outdated branch. It was a global variable, which has since been renamed to:

FORCE_UPCAST_ATTENTION_DTYPE

Also, nicely, it now defaults to torch.float32 (rather than the string "fp32") unless the --dont-upcast-attention flag is set.

There is also a nice new convenience function, get_attn_precision, which returns FORCE_UPCAST_ATTENTION_DTYPE when upcasting is forced and otherwise falls back to whatever was passed in via the attn_precision parameter. This all makes a whole lot more sense now, lol.

FORCE_UPCAST_ATTENTION_DTYPE = model_management.force_upcast_attention_dtype()

def get_attn_precision(attn_precision):
    if args.dont_upcast_attention:
        return None
    if FORCE_UPCAST_ATTENTION_DTYPE is not None:
        return FORCE_UPCAST_ATTENTION_DTYPE
    return attn_precision

I'm assuming this change in how precision is retrieved was made to allow support for dtypes such as torch.bfloat16, so I'm not forcing torch.float32 when get_attn_precision returns None. Instead, I'm assuming we want to force everything to q.dtype, since we already take q.shape.

I still have no idea where your q gets recast to torch.float16 (or your k and v to torch.float32; I'm not sure which is happening), but forced type casting should be in the function anyway, to handle the attn_precision parameter.

This code should work; it uses the convenience function.

def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    attn_precision = get_attn_precision(attn_precision)
    force_cast_dtype = attn_precision if attn_precision is not None else q.dtype
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=force_cast_dtype),
        (q, k, v),
    )

    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out
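
A quick way to sanity-check the patched function locally (a sketch: shapes are made up, get_attn_precision is stubbed to return None so the q.dtype fallback path is exercised, and older PyTorch builds may need to run this on a CUDA device for fp16):

import torch

def get_attn_precision(attn_precision):  # stub for local testing only
    return None

q = torch.randn(2, 128, 512, dtype=torch.float16)
k = torch.randn(2, 77, 512, dtype=torch.float32)
v = torch.randn(2, 77, 512, dtype=torch.float32)

out = attention_pytorch(q, k, v, heads=8)
print(out.shape, out.dtype)  # torch.Size([2, 128, 512]) torch.float16: k, v were cast to q.dtype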

Could you test this change again and report back?

cnhh-cyc commented 1 month ago

Well, it works!!! For the same task the 1080 Ti takes 23 minutes, while the 4090 takes 1 minute. Many thanks!

shawnington commented 1 month ago

> Well, it works!!! For the same task the 1080 Ti takes 23 minutes, while the 4090 takes 1 minute. Many thanks!

Glad to hear! Did you patch the code or use the pull request I submitted?

cnhh-cyc commented 1 month ago

Yes! I used your code:

def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    attn_precision = get_attn_precision(attn_precision)
    force_cast_dtype = attn_precision if attn_precision is not None else q.dtype
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=force_cast_dtype),
        (q, k, v),
    )

    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out

shawnington commented 1 month ago

If you could also try out the pull request referenced in this issue, let me know whether that fixes the issue as well. Thanks.