Open cnhh-cyc opened 1 month ago
c10 is PyTorch's core C++ library, so this error originates from PyTorch's C++ code. Unfortunately, AI is not going to fix the code for you, and we can't help if you don't post the full error. Please also keep it to English, so more people can understand what is going on.
Error occurred when executing KSampler:
Expected query, key, and value to have the same dtype, but got query.dtype: struct c10::Half key.dtype: float and value.dtype: float instead.
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\nodes.py", line 1344, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\nodes.py", line 1314, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\control_reference.py", line 47, in refcn_sample
    return orig_comfy_sample(model, *args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\utils.py", line 111, in uncond_multiplier_check_cn_sample
    return orig_comfy_sample(model, *args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\sample.py", line 37, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 761, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 663, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 650, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 629, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 534, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 542, in sample_dpmpp_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 272, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 616, in __call__
    return self.predict_noise(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 619, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 258, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 218, in calc_cond_batch
    output = model.apply_model(inputx, timestep, c).chunk(batch_chunks)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Advanced-ControlNet\adv_control\utils.py", line 63, in apply_model_uncond_cleanup_wrapper
    return orig_apply_model(self, *args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 97, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 852, in forward
    h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 44, in forward_timestep_embed
    x = layer(x, context, transformer_options)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\attention.py", line 644, in forward
    x = block(x, context=context[i], transformer_options=transformer_options)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\attention.py", line 568, in forward
    n = attn2_replace_patch[block_attn2](n, context_attn2, value_attn2, extra_options)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\CrossAttentionPatch.py", line 26, in __call__
    out = out + callback(out, q, k, v, extra_options, **self.kwargs[i])
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\CrossAttentionPatch.py", line 150, in ipadapter_attention
    out_ip = optimized_attention(q, ip_k, ip_v, extra_options["n_heads"])
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\comfy\ldm\modules\attention.py", line 357, in attention_pytorch
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
So c10::Half is a 16-bit dtype (torch.float16), while float is 32-bit (torch.float32). In other words, q is torch.float16, while k and v are torch.float32.
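To make the mismatch concrete, here is a small sketch that inspects the bit widths of the two dtypes and checks whether three tensors agree. The helper name dtypes_match is made up for illustration; it is not part of ComfyUI or PyTorch.

```python
import torch

def dtypes_match(q, k, v):
    """Return True when q, k and v share a single dtype."""
    return q.dtype == k.dtype == v.dtype

# c10::Half is the C++ name behind torch.float16; "float" is torch.float32.
half_bits = torch.finfo(torch.float16).bits
float_bits = torch.finfo(torch.float32).bits

# Same dtype combination as in the error message above.
q = torch.zeros(1, 4, 8, dtype=torch.float16)
k = torch.zeros(1, 4, 8, dtype=torch.float32)
v = torch.zeros(1, 4, 8, dtype=torch.float32)

print(half_bits, float_bits)                           # bit widths of the two dtypes
print(dtypes_match(q, k, v))                           # mismatched, as reported
print(dtypes_match(q, k.to(q.dtype), v.to(q.dtype)))   # after aligning k and v to q
```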
It looks like the issue comes from the ComfyUI-Advanced-ControlNet call to attention_pytorch, which is called as self.actual_attention in the forward method of class CrossAttentionMM(nn.Module):
def forward(self, x, context=None, value=None, mask=None, scale_mask=None):
    q = self.to_q(x)
    context = default(context, x)
    k: Tensor = self.to_k(context)
    if value is not None:
        v = self.to_v(value)
        del value
    else:
        v = self.to_v(context)
    # apply custom scale by multiplying k by scale factor
    if self.scale is not None:
        k *= self.scale
    # apply scale mask, if present
    if scale_mask is not None:
        k *= scale_mask
    try:
        out = self.actual_attention(q, k, v, self.heads, mask)
    except RuntimeError as e:
        if str(e).startswith("CUDA error: invalid configuration argument"):
            self.actual_attention = fallback_attention_mm
            out = self.actual_attention(q, k, v, self.heads, mask)
        else:
            raise
    return self.to_out(out)
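One detail of the try/except in forward is easy to miss: it retries only when the message starts with "CUDA error: invalid configuration argument", so a dtype-mismatch RuntimeError is re-raised untouched. The sketch below isolates that pattern in plain Python; the class and the three toy backends are hypothetical stand-ins, not code from either custom node.

```python
class AttentionWithFallback:
    """Toy stand-in for a module that swaps its attention backend on one specific error."""

    def __init__(self, primary, fallback):
        self.actual_attention = primary
        self.fallback = fallback

    def run(self, *inputs):
        try:
            return self.actual_attention(*inputs)
        except RuntimeError as e:
            # Only the known launch-configuration failure triggers the swap;
            # any other RuntimeError (such as a dtype mismatch) propagates.
            if str(e).startswith("CUDA error: invalid configuration argument"):
                self.actual_attention = self.fallback
                return self.actual_attention(*inputs)
            raise


def broken_backend(*inputs):
    raise RuntimeError("CUDA error: invalid configuration argument")

def dtype_error_backend(*inputs):
    raise RuntimeError("Expected query, key, and value to have the same dtype")

def simple_backend(*inputs):
    return sum(inputs)


attn = AttentionWithFallback(broken_backend, simple_backend)
result = attn.run(1, 2, 3)  # first call fails, the fallback is installed and used
print(result, attn.actual_attention is simple_backend)
```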
I'm not sure how the context works for this function, but I assume that either there is a bug that can cause q, k, and v to be cast differently, or there is an assumption that attention_pytorch does forced casting. attention_pytorch does not currently force casting, even though it has an unused attn_precision parameter, whereas attention_basic does force a cast.
This causes the error in attention_pytorch in ComfyUI\comfy\ldm\modules\attention.py, because torch.nn.functional.scaled_dot_product_attention expects q, k, and v to have the same dtype.
def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2),
        (q, k, v),
    )
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out
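For readers unfamiliar with the view/transpose calls in this function, the sketch below isolates the head split and merge. The helper names split_heads and merge_heads are illustrative only; the function above does the same thing inline.

```python
import torch

def split_heads(t, heads):
    """(batch, seq, heads * dim_head) -> (batch, heads, seq, dim_head)."""
    b, _, inner = t.shape
    dim_head = inner // heads
    return t.view(b, -1, heads, dim_head).transpose(1, 2)

def merge_heads(t):
    """(batch, heads, seq, dim_head) -> (batch, seq, heads * dim_head)."""
    b, heads, _, dim_head = t.shape
    return t.transpose(1, 2).reshape(b, -1, heads * dim_head)

x = torch.arange(2 * 5 * 16, dtype=torch.float32).reshape(2, 5, 16)
per_head = split_heads(x, heads=4)
restored = merge_heads(per_head)

print(per_head.shape)   # torch.Size([2, 4, 5, 4])
print(restored.shape)   # torch.Size([2, 5, 16])
```

Neither step touches the dtype, which is why a mismatch between q, k, and v survives all the way to scaled_dot_product_attention.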
attn_precision is not passed into the function, and it's unclear whether it would be expected as a dtype or as a string, since nothing is done with it in the function. However, _ATTN_PRECISION in attention.py is "fp32" unless set to "fp16" by the command-line flag --dont-upcast-attention, so I am just going to use that to decide whether to cast to torch.float32 or torch.float16.
These changes should work. The cast is added to the lambda function.
def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    cast_to_type = torch.float32 if _ATTN_PRECISION == "fp32" else torch.float16
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=cast_to_type),
        (q, k, v),
    )
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out
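A side note on the added .to(dtype=...) call: per the PyTorch documentation, Tensor.to returns the tensor itself when it already has the requested dtype and device, so the cast should cost nothing for inputs that already match. A quick check, using throwaway tensors:

```python
import torch

x32 = torch.ones(2, 3, dtype=torch.float32)

same = x32.to(dtype=torch.float32)     # dtype already matches
changed = x32.to(dtype=torch.float16)  # dtype differs, so a converted copy is made

print(same is x32)      # True: no new tensor is created
print(changed is x32)   # False
print(changed.dtype)    # torch.float16
```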
Could you change the attention_pytorch function in ComfyUI\comfy\ldm\modules\attention.py to match, and let me know if it fixes the problem?
If it does, I'll open a pull request with the changes.
Thanks for your help!
In "def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None): cast_to_type = torch.float32 if _ATTN_PRECISION == "fp32" else torch.float16", is _ATTN_PRECISION a global variable? I can't find its definition in attention.py, either above or below this code.
Also, using your code exactly as given, an error is now reported at the IPAdapter Style&Composition SDXL node. The previous error occurred later, at the KSampler step. Error log:
Error occurred when executing IPAdapterStyleComposition:
name '_ATTN_PRECISION' is not defined
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\IPAdapterPlus.py", line 758, in apply_ipadapter
    work_model, face_image = ipadapter_execute(work_model, ipadapter_model, clip_vision, **ipa_args)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\IPAdapterPlus.py", line 309, in ipadapter_execute
    img_cond_embeds = encode_image_masked(clipvision, image, batch_size=encode_batch_size)
  File "F:\SD\new_ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus-main\utils.py", line 177, in encode_image_masked
    out = clip_vision.model(pixel_values=pixel_values, intermediate_output=-2)
  File "F:\SD\new_ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
How embarrassing, I was working from an outdated branch. It was a global variable, but it has since been changed to FORCE_UPCAST_ATTENTION_DTYPE.
Nicely, it now defaults to torch.float32 instead of the string "fp32", unless the --dont-upcast-attention flag is set.
There is also a nice new convenience function, get_attn_precision, which returns FORCE_UPCAST_ATTENTION_DTYPE when it is set and otherwise falls back to the dtype passed in as attn_precision. This all makes a whole lot more sense now, lol.
FORCE_UPCAST_ATTENTION_DTYPE = model_management.force_upcast_attention_dtype()

def get_attn_precision(attn_precision):
    if args.dont_upcast_attention:
        return None
    if FORCE_UPCAST_ATTENTION_DTYPE is not None:
        return FORCE_UPCAST_ATTENTION_DTYPE
    return attn_precision
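Read literally, get_attn_precision checks three things in order: the command-line flag, then the forced dtype, then the caller's argument. The dependency-free sketch below mirrors that order with the two globals turned into parameters; resolve_attn_precision and the string values are stand-ins for illustration, not ComfyUI code.

```python
def resolve_attn_precision(attn_precision, dont_upcast, forced_dtype):
    """Mirror of the three-step lookup, with the globals passed in explicitly."""
    if dont_upcast:                  # stands in for args.dont_upcast_attention
        return None
    if forced_dtype is not None:     # stands in for FORCE_UPCAST_ATTENTION_DTYPE
        return forced_dtype
    return attn_precision

# Strings are used in place of torch dtypes so the sketch needs no imports.
cases = {
    "flag set": resolve_attn_precision("float32", dont_upcast=True, forced_dtype="float32"),
    "forced dtype": resolve_attn_precision("bfloat16", dont_upcast=False, forced_dtype="float32"),
    "caller value": resolve_attn_precision("bfloat16", dont_upcast=False, forced_dtype=None),
    "nothing set": resolve_attn_precision(None, dont_upcast=False, forced_dtype=None),
}
for name, value in cases.items():
    print(name, "->", value)
```

The first and last cases both return None, which is the situation where the caller has to pick a dtype on its own.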
I'm assuming that this change in how precision is retrieved was made to support types such as torch.bfloat16, so I'm not forcing torch.float32 if get_attn_precision returns None. Instead, I'm assuming that we want to force everything to q.dtype, since we are already taking q.shape.
I still have no idea where your q gets recast to torch.float16 (or your k and v to torch.float32; I'm not sure which is happening). However, forced type casting should be in the function anyway, to handle the attn_precision variable.
This code should work, and it uses the convenience function.
def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    attn_precision = get_attn_precision(attn_precision)
    force_cast_dtype = attn_precision if attn_precision is not None else q.dtype
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=force_cast_dtype),
        (q, k, v),
    )
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out
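For anyone who wants to try the idea outside ComfyUI, here is a self-contained sketch of the same function. get_attn_precision_stub is a hypothetical stand-in that behaves as if no flag and no forced dtype were set, and float32/float64 stand in for the float16/float32 pair so the example also runs on CPU. It assumes PyTorch 2.0 or newer, where scaled_dot_product_attention is available.

```python
import torch
import torch.nn.functional as F

def get_attn_precision_stub(attn_precision):
    # Hypothetical stand-in for ComfyUI's get_attn_precision:
    # no command-line flag and no forced dtype, so the argument passes through.
    return attn_precision

def attention_sketch(q, k, v, heads, mask=None, attn_precision=None):
    attn_precision = get_attn_precision_stub(attn_precision)
    # Fall back to q's dtype when no precision was requested.
    cast_dtype = attn_precision if attn_precision is not None else q.dtype
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = (
        t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=cast_dtype)
        for t in (q, k, v)
    )
    out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    return out.transpose(1, 2).reshape(b, -1, heads * dim_head)

# Deliberately mismatched inputs: q is float32, k and v are float64.
q = torch.randn(1, 6, 8, dtype=torch.float32)
k = torch.randn(1, 6, 8, dtype=torch.float64)
v = torch.randn(1, 6, 8, dtype=torch.float64)

out = attention_sketch(q, k, v, heads=2)
out64 = attention_sketch(q, k, v, heads=2, attn_precision=torch.float64)
print(out.shape, out.dtype)
print(out64.dtype)
```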
Could you test this change for me again and report back?
Well, it works!!! The same task takes 23 minutes on the 1080 Ti, while the 4090 takes 1 minute. Many thanks!
Glad to hear! Did you patch the code or use the pull request I submitted?
Yes! I used your code:
def attention_pytorch(q, k, v, heads, mask=None, attn_precision=None):
    attn_precision = get_attn_precision(attn_precision)
    force_cast_dtype = attn_precision if attn_precision is not None else q.dtype
    b, _, dim_head = q.shape
    dim_head //= heads
    q, k, v = map(
        lambda t: t.view(b, -1, heads, dim_head).transpose(1, 2).to(dtype=force_cast_dtype),
        (q, k, v),
    )
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)
    out = (
        out.transpose(1, 2).reshape(b, -1, heads * dim_head)
    )
    return out
Could you also try out the submitted pull request referenced in this issue and let me know whether it fixes the problem too? Thanks.
When running on the 4090 GPU, the error "got query.dtype: struct c10::Half key.dtype: float and value.dtype: float" is not reported. After searching online, I learned that the 1080 Ti does not support float16 calculations. Judging from the results, the calculation has been completed. I tried changing attention.py and CrossAttentionPatch.py according to the AI's suggestions, but more errors occurred. At which step could query.dtype be converted to match key.dtype?
Google's AI also suggested the following to me, but it does not work.
Thanks, waiting for your help! Many thanks.
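import torch

def get_dtype(device):
    """Automatically choose a dtype for the given GPU."""
    if torch.cuda.is_available():
        properties = torch.cuda.get_device_properties(device)
        # Decide from the GPU architecture or model whether float16 is supported.
        # The suggestion left this check out; the compute-capability threshold
        # used here (7.0 or newer) is an assumption, not a confirmed rule.
        return torch.float16 if properties.major >= 7 else torch.float32
    else:
        return torch.float32  # CPU defaults to float32

# Example
query = torch.randn(1, 1, 1, 1)
key = torch.randn(1, 1, 1, 1)
value = torch.randn(1, 1, 1, 1)

dtype = get_dtype(0)  # get the dtype for GPU 0
query = query.to(dtype)
key = key.to(dtype)
value = value.to(dtype)
# Continue with other operations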