Open jiho9702 opened 1 year ago
Hey, how is it going? Did you figure it out?
No, I didn't 😢
Well, that's too bad. I guess we're stuck then.
Are you having the same problem?
I was having that problem when I altered superresolution.py for my use case. You could try running the pipeline provided by diffusers, though. Since this issue got no response, I switched to running txt2img.py, and now I am getting a new error:
(base) root@stablediffusion$ python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt v2-1_768-nonema-pruned.ckpt --config configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768
"RuntimeError: expected scalar type BFloat16 but found Float"
Wait, you're getting the problem from something else, not the superresolution.py script
Ok, I was fooling around, and got "RuntimeError: Input type (c10::Half) and bias type (float) should be the same" error again. It doesn't show up when you use "cuda". Why are you trying mps instead?
You can use this fork, which supports MPS: https://github.com/Tps-F/stablediffusion
It doesn't work 🥺
This error occurred:
Traceback (most recent call last):
File "/Users/blackcat/study/stablediffusion/scripts/txt2img.py", line 393, in
After days of troubleshooting, I was able to resolve this by upgrading tensorflow to 2.11.0 and setting the use_fp16 parameter in the v2-inference.yaml file to False.
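For anyone who prefers to script that edit, here is a minimal sketch. It assumes the config has already been loaded into nested dicts and lists (for example with a YAML loader) and that use_fp16 may sit at any depth. The sample nesting is illustrative rather than copied from v2-inference.yaml, and the helper name disable_fp16 is hypothetical.

```python
from typing import Any


def disable_fp16(node: Any) -> int:
    """Set every `use_fp16` key found in a nested dict/list to False.

    Returns the number of keys that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "use_fp16" and value is not False:
                node[key] = False  # existing key, so the dict size is unchanged
                changed += 1
            else:
                changed += disable_fp16(value)
    elif isinstance(node, list):
        for item in node:
            changed += disable_fp16(item)
    return changed


# Illustrative config fragment (assumed layout, not the real file contents).
config = {
    "model": {
        "params": {
            "unet_config": {"params": {"use_fp16": True, "in_channels": 4}},
        }
    }
}

print(disable_fp16(config))  # number of flags flipped
print(config["model"]["params"]["unet_config"]["params"]["use_fp16"])
```

Walking the whole tree avoids hard-coding a path that may differ between config files, at the cost of also flipping any other use_fp16 entries that happen to exist.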
Try using v2-inference-v-mac.yaml.
@lakejee-rebel @Tps-F How long does it take to execute? It takes an hour to create an image on Tps-F's stable diffusion model
@Tps-F It's faster because I reduced the batch size. Thank you. Are you interested in object detection models like SSD (Single Shot MultiBox Detector) or YOLO? I want to try SSD on an M1 Mac, but that model is written for CUDA. How do I convert it from CUDA to MPS?
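As a rough starting point for that kind of port, the usual first step is to stop hard-coding "cuda" and pick the device at runtime. The sketch below is an assumption about how one might do this, not code from the SSD or YOLO projects; pick_device and detect_device are hypothetical helper names, and it falls back to the CPU when PyTorch is not installed.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends exist; report "cpu" if torch is missing."""
    try:
        import torch
    except ImportError:
        return pick_device(False, False)
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


device = detect_device()
print(device)
# With PyTorch installed, models and tensors are then moved with .to(device)
# in place of hard-coded .cuda() calls.
```

Operators that have no MPS implementation would still need separate handling, so this only covers device selection.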
Shall I do it?
I'd appreciate it if you did that.
@Tps-F Can I follow you?
Sure! By the way, there seem to be several implementations of SSD and YOLO. Which one should I support?
Since we are not going to talk here, would you like to go to the discord or something?
Okay, good. What is your Discord ID? I will follow you.
Thank you- Ftps#3389
@Tps-F Hi, I get a similar error using your fork:
.../venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
I used your v2-inference-v-mac.yaml as well, and updated tensorflow to 2.11.0 as suggested, but it doesn't work...
Can I connect with you on Discord? I already sent a request... :)
I would like to see all the logs and what you have run. Can you show me?
Sure! But I might as well talk about it here in case anyone encounters a similar error in the future!
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
I also got the above problem, @yyahav. I am using Ubuntu, not macOS.
@tommysnu Can you please share the entire stacktrace? I've made a change in the code which seems to work for me
Likewise, please share your logs with us so I can improve.
Traceback (most recent call last):
File "/mnt/workspace/stablediffusion/scripts/txt2img.py", line 388, in <module>
main(opt)
File "/mnt/workspace/stablediffusion/scripts/txt2img.py", line 347, in main
samples, _ = sampler.sample(S=opt.steps,
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddim.py", line 104, in sample
samples, intermediates = self.ddim_sampling(conditioning, size,
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddim.py", line 164, in ddim_sampling
outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_steps=ddim_use_original_steps,
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddim.py", line 212, in p_sample_ddim
model_uncond, model_t = self.model.apply_model(x_in, t_in, c_in).chunk(2)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddpm.py", line 1335, in forward
out = self.diffusion_model(x, t, context=cc)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
h = module(h, emb, context)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 86, in forward
x = layer(x)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
This is my logs after I run:
python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt v2-1_768-ema-pruned.ckpt --config configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768
(Link: https://github.com/Stability-AI/stablediffusion#reference-sampling-script)
Could you give me any suggestion @yyahav and @Tps-F ? Thank you so much
I know you are using Ubuntu, but could you try the config for Mac? https://github.com/Tps-F/stablediffusion/blob/mps-cpu-support/configs/stable-diffusion/mac/v2-inference-v-mac.yaml
I think the reason this happens is because you are using fp16
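To check that guess, one option is to list which parameters ended up in a different dtype from the rest of the model. The following is a minimal sketch under stated assumptions: find_dtype_outliers is a hypothetical helper, and the stand-in objects only mimic the .dtype attribute so the example runs without PyTorch. Note that the error compares the input tensor with the bias, so the input's dtype has to be checked as well.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def find_dtype_outliers(named_params: Iterable[Tuple[str, Any]]) -> List[Tuple[str, Any]]:
    """Return (name, dtype) pairs whose dtype differs from the most common one."""
    pairs = [(name, param.dtype) for name, param in named_params]
    if not pairs:
        return []
    majority, _ = Counter(dtype for _, dtype in pairs).most_common(1)[0]
    return [(name, dtype) for name, dtype in pairs if dtype != majority]


# Stand-in objects so the sketch runs without PyTorch; with a real model,
# pass model.named_parameters() instead.
fake_params = [
    ("conv.weight", SimpleNamespace(dtype="float16")),
    ("conv.bias", SimpleNamespace(dtype="float32")),
    ("norm.weight", SimpleNamespace(dtype="float16")),
]
print(find_dtype_outliers(fake_params))
```

An empty result means all parameters share one dtype, which would point at the input tensor (or an autocast setting) rather than the weights.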
Thanks, Tps-F. After using this config file I get another error, shown below:
Sampling: 0%| | 0/3 [00:00<?, ?it/sData shape for DDIM sampling is (3, 4, 96, 96), eta 0.0 | 0/1 [00:00<?, ?it/s]
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 0%| | 0/50 [00:00<?, ?it/s]
data: 0%| | 0/1 [00:02<?, ?it/s]
Sampling: 0%| | 0/3 [00:02<?, ?it/s]
Traceback (most recent call last):
File "/mnt/workspace/stablediffusion/scripts/txt2img.py", line 388, in <module>
main(opt)
File "/mnt/workspace/stablediffusion/scripts/txt2img.py", line 347, in main
samples, _ = sampler.sample(S=opt.steps,
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddim.py", line 104, in sample
samples, intermediates = self.ddim_sampling(conditioning, size,
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddim.py", line 164, in ddim_sampling
outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_st
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddim.py", line 212, in p_sample_ddim
model_uncond, model_t = self.model.apply_model(x_in, t_in, c_in).chun
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/models/diffusion/ddpm.py", line 1335, in forward
out = self.diffusion_model(x, t, context=cc)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
h = module(h, emb, context)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
x = layer(x, context)
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/attention.py", line 334, in forward
x = block(x, context=context[i])
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/attention.py", line 269, in forward
return checkpoint(self._forward, (x, context), self.parameters(), self.ch
File "/mnt/workspace/stablediffusion/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
return CheckpointFunction.apply(func, len(inputs), *args)
File "/mnt/workspace/stablediffusion/ldm/modules/diffusionmodules/util.py", line 136, in forward
output_tensors = ctx.run_function(*ctx.input_tensors)
File "/mnt/workspace/stablediffusion/ldm/modules/attention.py", line 272, in _forward
x = self.attn1(self.norm1(x), context=context if self.disable_self_attn e
File "/home/tommy/anaconda3/envs/t2im/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/workspace/stablediffusion/ldm/modules/attention.py", line 233, in forward
out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op
File "/mnt/workspace/xformers/xformers/ops/fmha/__init__.py", line 192, in memory_efficient_attention
return _memory_efficient_attention(
File "/mnt/workspace/xformers/xformers/ops/fmha/__init__.py", line 290, in _memory_efficient_attention
return _memory_efficient_attention_forward(
File "/mnt/workspace/xformers/xformers/ops/fmha/__init__.py", line 306, in _memory_efficient_attention_forward
op = _dispatch_fw(inp)
File "/mnt/workspace/xformers/xformers/ops/fmha/dispatch.py", line 98, in _dispatch_fw
return _run_priority_list(
File "/mnt/workspace/xformers/xformers/ops/fmha/dispatch.py", line 73, in _run_priority_list
raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with
inputs:
query : shape=(30, 9216, 1, 64) (torch.float32)
key : shape=(30, 9216, 1, 64) (torch.float32)
value : shape=(30, 9216, 1, 64) (torch.float32)
attn_bias : <class 'NoneType'>
p : 0.0
`cutlassF` is not supported because:
device=cpu (supported: {'cuda'})
`flshattF` is not supported because:
device=cpu (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`tritonflashattF` is not supported because:
device=cpu (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`smallkF` is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
unsupported embed per head: 64
Could you try to remove --xformers flag?
python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt checkpoints/v2-1_768-ema-pruned.ckpt --config configs/stable-diffusion/v2-inference-mac.yaml --H 768 --W 768 --precision full
I don't use --xformers.
I think that error is caused by xformers, so I'd like you to try uninstalling it
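The log above shows xformers' memory_efficient_attention being called even though no --xformers flag was passed, which suggests the attention module takes that path whenever the package can be imported. That is an inference from the traceback, not something verified against the repository. A quick way to see whether xformers is visible to the current interpreter, using only the standard library (is_importable is a hypothetical helper name):

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


has_xformers = is_importable("xformers")
print(f"xformers importable: {has_xformers}")
```

If this prints True in the environment that runs txt2img.py, uninstalling xformers from that environment is what would make the check fail.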
I ran into this when using the diffusers pipeline.
My solution was:
pipe.to("cuda", torch.float16)
In general, try:
variable.to(device, torch_type)
After that it ran without problems.
I receive the same error when using txt2img on webui. Running on Docker without venv (see Dockerfile at https://github.com/helv-io/stable-diffusion-webui). It's an untainted build.
Running on a 8GB Tesla P4. Generation works, but in the webui, I lose the progress bar.
Any advice? I tried changing use_fp16 to False, but got the same result.
I'm using Ubuntu with transformers==4.25.1 and timm==0.5.4.
I changed pretrained_model.encoder.to(torch.bfloat16) to pretrained_model.encoder.to(torch.float), and now it works for me.
I was getting these, and eventually deduced what was going on. These errors appeared during the render process when it'd show updates. Instead of updates, these errors were appearing... I did some research. I had been moving the default models folder and using my own. The folder I had didn't have various files it was looking for. It turns out I needed to merge in the default files to get it working...
Here for example is what I typed in...
cd stable-diffusion-webui
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git SD-hold
cp -r SD-hold/models/* models/
rm -rf SD-hold
That restores the files to the models folder tree, and solved this for me.
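One caveat with cp -r is that it overwrites files of the same name. For anyone who wants to restore only what is missing, here is a small sketch of that variation in Python. It is not what was run above; merge_missing is a hypothetical helper and the file names in the demonstration are made up.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def merge_missing(src: Path, dst: Path) -> List[Path]:
    """Copy files from `src` into `dst` only where `dst` lacks them.

    Returns the relative paths that were copied.
    """
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        target = dst / rel
        if target.exists():
            continue  # keep the user's own file untouched
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(rel)
    return copied


# Small demonstration in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    defaults = Path(tmp) / "SD-hold" / "models"
    mine = Path(tmp) / "models"
    (defaults / "VAE-approx").mkdir(parents=True)
    (defaults / "VAE-approx" / "model.pt").write_text("default")
    (defaults / "notes.txt").write_text("default")
    mine.mkdir()
    (mine / "notes.txt").write_text("mine")

    demo_copied = merge_missing(defaults, mine)
    demo_kept = (mine / "notes.txt").read_text()
    print(demo_copied, demo_kept)
```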
This fixed it, thanks! Perhaps those required files could be moved somewhere else, leaving the models folder empty for user-managed models? This would improve containerization.
I use yolov5, and when I put an ACmix block into my code I get the same error. Can someone help me, please?
File "c:/Users/xie/Desktop/yolov5-master/train.py", line 541, in main
train(opt.hyp, opt, device, callbacks)
File "c:/Users/xie/Desktop/yolov5-master/train.py", line 374, in train
compute_loss=compute_loss)
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "c:\Users\xie\Desktop\yolov5-master\val.py", line 210, in run
preds, train_out = model(im) if compute_loss else (model(im, augment=augment), None)
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "c:\Users\xie\Desktop\yolov5-master\models\yolo.py", line 209, in forward
return self._forward_once(x, profile, visualize) # single-scale inference, train
File "c:\Users\xie\Desktop\yolov5-master\models\yolo.py", line 121, in _forward_once
x = m(x) # run
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "c:\Users\xie\Desktop\yolov5-master\models\common.py", line 947, in forward
pe = self.conv_p(position(h, w, x.is_cuda))
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\conv.py", line 467, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\conv.py", line 464, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same
I got a similar RuntimeError when using SDXL in the AUTOMATIC1111 interface. The --no-half-vae option fixed it for me.
I believe there is a VRAM leak that occurs when an image fails to upscale: the VRAM acquired for the job is never released unless you restart webui.
If you see that your image could not be upscaled, your VRAM has leaked. In that case, restart webui entirely to be able to use the GPU properly again.
My specs:
GPU: Nvidia Quadro RTX 5000 GPU (16 GB VRAM).
OS: Fedora 38.
Acquired VRAM before webui is started:
$ nvidia-smi
Sat Aug 12 01:26:05 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05 Driver Version: 535.86.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro RTX 5000 Off | 00000000:01:00.0 On | N/A |
| N/A 55C P5 10W / 90W | 863MiB / 16384MiB | 1% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2998 G /usr/bin/gnome-shell 463MiB |
| 0 N/A N/A 7491 G /usr/bin/Xwayland 7MiB |
| 0 N/A N/A 20394 G /usr/bin/nautilus 84MiB |
| 0 N/A N/A 35817 G /usr/lib64/firefox/firefox 301MiB |
+---------------------------------------------------------------------------------------+
Acquired VRAM after webui has fully started, but before the first image generation job is triggered. About 7 GB of VRAM is now held by webui even though no image generation tasks have been submitted. I'm not sure whether that is expected; webui may be loading the checkpoint model into VRAM at startup:
$ nvidia-smi
Sat Aug 12 01:38:02 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05 Driver Version: 535.86.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro RTX 5000 Off | 00000000:01:00.0 On | N/A |
| N/A 55C P8 4W / 90W | 8326MiB / 16384MiB | 1% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2998 G /usr/bin/gnome-shell 462MiB |
| 0 N/A N/A 7491 G /usr/bin/Xwayland 9MiB |
| 0 N/A N/A 20394 G /usr/bin/nautilus 80MiB |
| 0 N/A N/A 35817 G /usr/lib64/firefox/firefox 221MiB |
| 0 N/A N/A 63368 C python3.10 7396MiB |
| 0 N/A N/A 63662 C ...diffusion-webui/venv/bin/python3.10 116MiB |
+---------------------------------------------------------------------------------------+
Acquired VRAM after the first image generation job was started but failed to upscale. Even though the job has stopped and is no longer running, acquired VRAM stays at about 14 GB, so no new image generation jobs can be submitted:
$ nvidia-smi
Sat Aug 12 01:40:05 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05 Driver Version: 535.86.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro RTX 5000 Off | 00000000:01:00.0 On | N/A |
| N/A 62C P8 10W / 90W | 15318MiB / 16384MiB | 1% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2998 G /usr/bin/gnome-shell 450MiB |
| 0 N/A N/A 7491 G /usr/bin/Xwayland 9MiB |
| 0 N/A N/A 20394 G /usr/bin/nautilus 86MiB |
| 0 N/A N/A 35817 G /usr/lib64/firefox/firefox 238MiB |
| 0 N/A N/A 63368 C python3.10 14376MiB |
| 0 N/A N/A 63662 C ...diffusion-webui/venv/bin/python3.10 116MiB |
+---------------------------------------------------------------------------------------+
Acquired VRAM after webui is stopped. Once webui is stopped, VRAM is fully released and can be reused:
$ nvidia-smi
Sat Aug 12 01:47:15 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05 Driver Version: 535.86.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro RTX 5000 Off | 00000000:01:00.0 On | N/A |
| N/A 53C P8 3W / 90W | 859MiB / 16384MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2998 G /usr/bin/gnome-shell 454MiB |
| 0 N/A N/A 7491 G /usr/bin/Xwayland 9MiB |
| 0 N/A N/A 20394 G /usr/bin/nautilus 86MiB |
| 0 N/A N/A 35817 G /usr/lib64/firefox/firefox 252MiB |
+---------------------------------------------------------------------------------------+
Conclusion: It seems there is some kind of bug which doesn't release VRAM on failures. I'm also not sure why 7 GB of VRAM is acquired during initial application startup. I'm also not sure why another 7 GB of VRAM is acquired after the image generation process (total 14 GB of VRAM).
A simple solution for now: just stop and start webui again. It will release all VRAM and you will be able to generate images again.
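To compare VRAM usage before and after a restart without reading the tables by eye, the process rows can be summed with a small script. This is only a sketch: it assumes the plain-text table layout shown in the nvidia-smi outputs above, and `parse_process_rows` and `compute_mib` are hypothetical helper names, not part of any tool.

```python
import re

# Matches process rows such as:
# | 0 N/A N/A 63368 C python3.10 14376MiB |
ROW = re.compile(
    r"^\|\s*(\d+)\s+\S+\s+\S+\s+(\d+)\s+([CG](?:\+[CG])?)\s+(.+?)\s+(\d+)MiB\s*\|$"
)


def parse_process_rows(text):
    """Return one dict per process row found in nvidia-smi's text output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            gpu, pid, kind, name, mib = match.groups()
            rows.append(
                {"gpu": int(gpu), "pid": int(pid), "type": kind, "name": name, "mib": int(mib)}
            )
    return rows


def compute_mib(rows):
    """Sum the memory of compute (type C) processes, ignoring pure graphics ones."""
    return sum(row["mib"] for row in rows if "C" in row["type"])


# Rows copied from the output posted above (webui still running).
sample = """
| 0 N/A N/A 35817 G /usr/lib64/firefox/firefox 238MiB |
| 0 N/A N/A 63368 C python3.10 14376MiB |
| 0 N/A N/A 63662 C ...diffusion-webui/venv/bin/python3.10 116MiB |
"""
rows = parse_process_rows(sample)
total = compute_mib(rows)
```

Running it on the output captured while webui is up and again after it is stopped shows how much memory the python3.10 processes were holding.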
Same error here, just the other way around: RuntimeError: Input type (float) and bias type (c10::Half) should be the same.
Only when I use Hires fix, any model, any upscaler.
That's not a VRAM leak because it does it every time I try, even after computer restart.
My command line : COMMANDLINE_ARGS=--upcast-sampling --medvram --xformers --no-half-vae
I changed torch from 1.13.1 to 2.0.1 and the problem disappeared!
My environment.yml:
name: stable-diffusion-env
channels:
  - defaults
dependencies:
  - python=3.8  # Change to your desired Python version
  - pip
  - pip:
    - ./invisible-watermark
    - torch==2.0.1
    - safetensors==0.3.3
    - accelerate==0.22.0
    - diffusers==0.20.2
    - transformers==4.33.1
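Since a pin in environment.yml only helps if the active environment actually picked it up, it may be worth checking what is really installed. A minimal sketch using only the standard library; the `PINS` dictionary mirrors the versions listed above, and `check_pins` is a hypothetical helper name.

```python
from importlib import metadata

# Versions copied from the environment.yml above.
PINS = {
    "torch": "2.0.1",
    "safetensors": "0.3.3",
    "accelerate": "0.22.0",
    "diffusers": "0.20.2",
    "transformers": "4.33.1",
}


def check_pins(pins):
    """Map each package to (wanted, installed, matches) for the active environment."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Ignore local build suffixes such as "+cu118" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_pins(PINS).items():
        print(f"{name}: wanted {wanted}, installed {installed}, {'OK' if ok else 'MISMATCH'}")
```

Run it with the same Python interpreter that launches the scripts, otherwise it reports on a different environment.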
This issue still stands with AMD GPUs, so I tried elsewhere. I wrote to ExLlama, and they say the issue is in HIP (or perhaps ROCm): https://github.com/turboderp/exllama/issues/281 So here is my post on the GitHub HIP project, in the hope that they'll resolve it: https://github.com/ROCm-Developer-Tools/HIP/issues/3331
I had an fp16 error (fixed by using --precision full and disabling fp16). Then the error mentioned in this thread appeared. Inserting the line - torch==2.0.1 did not help; I checked Anaconda and I do have 2.0.1 installed. I have an Intel HD 4000 on an HP laptop with Windows 8.1. Any ideas, anybody? Shall I try to force v2-inference.yaml instead of v2-inference-v next? If so, how do I do that?
I got a similar error, but I was running it on Windows 10 instead of Mac:
RuntimeError: Input type (struct c10::Half) and bias type (float) should be the same
I tried removing --xformers from the command line, adding --no-half and --no-half-vae, re-installing torch, and restarting the computer, but none of these worked. Can anybody help? Thanks.
The command line:
--medvram --xformers --opt-split-attention --disable-nan-check
The whole log:
File "H:\AI\sd-webui\modules\call_queue.py", line 56, in f
res = list(func(*args, **kwargs))
File "H:\AI\sd-webui\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "H:\AI\sd-webui\modules\txt2img.py", line 56, in txt2img
processed = process_images(p)
File "H:\AI\sd-webui\modules\processing.py", line 503, in process_images
res = process_images_inner(p)
File "H:\AI\sd-webui\modules\processing.py", line 653, in process_images_inner
samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
File "H:\AI\sd-webui\modules\processing.py", line 869, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "H:\AI\sd-webui\modules\sd_samplers_kdiffusion.py", line 358, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "H:\AI\sd-webui\modules\sd_samplers_kdiffusion.py", line 234, in launch_sampling
return func()
File "H:\AI\sd-webui\modules\sd_samplers_kdiffusion.py", line 358, in <lambda>
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "H:\AI\sd-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "H:\AI\sd-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 553, in sample_dpmpp_sde
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "H:\AI\sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "H:\AI\sd-webui\modules\sd_samplers_kdiffusion.py", line 155, in forward
sd_samplers_common.store_latent(x_out[0:uncond.shape[0]])
File "H:\AI\sd-webui\modules\sd_samplers_common.py", line 58, in store_latent
shared.state.assign_current_image(sample_to_image(decoded))
File "H:\AI\sd-webui\modules\sd_samplers_common.py", line 46, in sample_to_image
return single_sample_to_image(samples[index], approximation)
File "H:\AI\sd-webui\modules\sd_samplers_common.py", line 35, in single_sample_to_image
x_sample = sd_vae_approx.model()(sample.to(devices.device, devices.dtype).unsqueeze(0))[0].detach()
File "H:\AI\sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "H:\AI\sd-webui\modules\sd_vae_approx.py", line 28, in forward
x = layer(x)
File "H:\AI\sd-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "H:\AI\sd-webui\extensions-builtin\Lora\lora.py", line 319, in lora_Conv2d_forward
return torch.nn.Conv2d_forward_before_lora(self, input)
File "H:\AI\sd-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "H:\AI\sd-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (struct c10::Half) and bias type (float) should be the same
I think it's a weird error, because it ran successfully before, but today something went wrong.
I solved my problem. It went wrong because I had deleted model.pt in VAE-approx; I restored it and the problem was solved.
When I tried generating an image the first time, it said "No such file or directory: '...\\models\\VAE-approx\\model.pt'", but I ignored it and tried generating again. Then it changed to another error, the one in my last comment. In fact, that second error appeared because the file had never been loaded.
So if somebody receives the same error, try checking whether a different error message appeared earlier. That might be the true reason for the error.
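A quick way to follow that advice is to scan the whole console log for the earliest error-looking line instead of reading only the last one. This is a rough sketch: `first_error_lines` is a hypothetical helper, the pattern is a simple heuristic, and the sample log is made up from the two messages quoted in this comment.

```python
import re

# Heuristic: Python exception names, or the OS-level "file not found" message.
ERROR_PATTERN = re.compile(r"\w*(?:Error|Exception)\b|No such file or directory")


def first_error_lines(log_text, limit=3):
    """Return up to `limit` (line_number, line) pairs for the earliest error-like lines."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.strip()))
            if len(hits) == limit:
                break
    return hits


# Illustrative log only, not a real capture.
sample_log = "\n".join(
    [
        "Loading weights",
        r"No such file or directory: '...\\models\\VAE-approx\\model.pt'",
        "Generating image",
        "RuntimeError: Input type (struct c10::Half) and bias type (float) should be the same",
    ]
)
hits = first_error_lines(sample_log)
```

Here the first hit is the missing model.pt message, which is the real cause, while the dtype error only shows up later.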
Same happened for me today
python scripts/txt2img.py --prompt "capybara" --ckpt .\checkpoints\512-base-ema.ckpt --n_samples 1 --precision=full --device=cpu --outdir ..\output
RuntimeError: Input type (struct c10::Half) and bias type (float) should be the same
I don't use this project; however, when I used HuggingFace with THUDM/chatglm2-6b-int4,
model = AutoModel.from_pretrained("THUDM/chatglm2-6b-int4", trust_remote_code=True)
I encountered a similar error:
File "/home/username/.cache/huggingface/modules/transformers_modules/chatglm2-6b-int4/quantization.py", line 248, in forward
output = inp.mm(weight.t())
RuntimeError: expected m1 and m2 to have the same dtype, but got: c10::Half != float
Then I tried and found the solution
model = AutoModel.from_pretrained("THUDM/chatglm2-6b-int4", trust_remote_code=True).cuda()
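More generally, the errors in this thread are raised when an input tensor and a layer's parameters have different dtypes. The sketch below is not what the chatglm code does internally; it only illustrates one way to make the two agree, assuming PyTorch is installed. `align_input_to_module` is a hypothetical helper, not a library function.

```python
import torch


def align_input_to_module(module, x):
    """Cast x to the device and dtype of the module's first parameter."""
    reference = next(module.parameters())
    return x.to(device=reference.device, dtype=reference.dtype)


conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)  # float32 weights and bias by default
half_input = torch.randn(1, 3, 16, 16).half()           # float16 input, as in the tracebacks

# Passing half_input directly would mix float16 input with float32 parameters;
# casting the input first keeps both sides on the same dtype.
output = conv(align_input_to_module(conv, half_input))
```

The opposite choice, converting the module with `.half()` or `.float()`, works as well; what matters is that input, weight, and bias end up with the same dtype on the same device.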
This may be somewhat related.
Traceback (most recent call last):
File "train_inpainting_dreambooth.py", line 876, in <module>
main(args)
File "train_inpainting_dreambooth.py", line 859, in main
save_weights(global_step)
File "train_inpainting_dreambooth.py", line 758, in save_weights
images = pipeline(
File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 818, in __call__
mask, masked_image_latents = self.prepare_mask_latents(
File "/usr/local/lib/python3.8/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 597, in prepare_mask_latents
masked_image_latents = self.vae.encode(masked_image).latent_dist.sample(generator=generator)
File "/usr/local/lib/python3.8/dist-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/diffusers/models/autoencoder_kl.py", line 164, in encode
h = self.encoder(x)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/diffusers/models/vae.py", line 109, in forward
sample = self.conv_in(sample)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 460, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
Steps: 40%|█████████████ | 6000/15000 [1:16:40<1:55:00, 1.30it/s, loss=0.107, lr=2e-6]
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/dblab/.local/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/home/dblab/.local/lib/python3.8/site-packages/accelerate/commands/launch.py", line 923, in launch_command
simple_launcher(args)
File "/home/dblab/.local/lib/python3.8/site-packages/accelerate/commands/launch.py", line 579, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_inpainting_dreambooth.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-inpainting', '--pretrained_vae_name_or_path=stabilityai/sd-vae-ft-mse', '--output_dir=../../../../Realstc_Vision_profile_hyunwoo5_inpaint', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=3434554', '--resolution=512', '--train_batch_size=2', '--train_text_encoder', '--mixed_precision=fp16', '--gradient_accumulation_steps=1', '--learning_rate=2e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=51', '--sample_batch_size=4', '--max_train_steps=15000', '--save_interval=1000', '--save_min_steps=6000', '--save_infer_steps=35', '--concepts_list=concepts_list.json', '--not_cache_latents', '--hflip']' returned non-zero exit status 1.
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
This issue occurs at min_steps (step 6000), after training had been running fine. It seems to be a problem when saving.
Removing '--mixed_precision=fp16' and changing to torch=2.0.2 did not help in my case. And when I tried to change the input type in python3.8/dist-packages/torch/nn/modules/conv.py, I received RuntimeError: Input type (float) and bias type (c10::Half) should be the same.
Is it Schrödinger's code?
Help me plz 🥺
fwiw I encountered this while using a1111, and restarting fixed it
I am using an M1 Pro MacBook and I am trying to run Stable Diffusion using mps.
I changed the CUDA parts to mps and changed ddim.py to use float32, because mps does not support float64.
def register_buffer(self, name, attr):
    if type(attr) == torch.Tensor:
        if attr.device != torch.device("mps"):
            attr = attr.to(torch.float32).to(torch.device("mps"))
    setattr(self, name, attr)

def make_schedule(self, ddim_num_steps, ddim_discretize="uniform", ddim_eta=0., verbose=True):
    self.ddim_timesteps = make_ddim_timesteps(ddim_discr_method=ddim_discretize, num_ddim_timesteps=ddim_num_steps,
                                              num_ddpm_timesteps=self.ddpm_num_timesteps, verbose=verbose)
    alphas_cumprod = self.model.alphas_cumprod
    assert alphas_cumprod.shape[0] == self.ddpm_num_timesteps, 'alphas have to be defined for each timestep'
    to_torch = lambda x: x.clone().detach().to(torch.float32).to(self.model.device)
Since then, this problem has occurred in conv.py:
def _conv_forward(self, input: Tensor, weight: Tensor, bias: Optional[Tensor]):
    if self.padding_mode != 'zeros':
        return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
                        weight, bias, self.stride,
                        _pair(0), self.dilation, self.groups)
    return F.conv2d(input, weight, bias, self.stride,
                    self.padding, self.dilation, self.groups)

def forward(self, input: Tensor) -> Tensor:
    return self._conv_forward(input, self.weight, self.bias)
Help me please.
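One possible direction, as a sketch only: rather than hard-coding "mps" in each place, pick the device once and cast every floating-point tensor to float32 on the way there, so nothing is left in float64 or float16. Both `pick_device` and `to_device_float32` are hypothetical helpers and not part of the stablediffusion code; the example assumes PyTorch is installed.

```python
import torch


def pick_device():
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def to_device_float32(value, device):
    """Move tensors to `device`; floating-point tensors are cast to float32 first."""
    if isinstance(value, torch.Tensor):
        if value.is_floating_point():
            # Covers float64 (not supported on mps) and float16 alike.
            return value.to(dtype=torch.float32).to(device)
        return value.to(device)
    return value  # non-tensor attributes pass through unchanged


device = pick_device()
buffer = to_device_float32(torch.linspace(0, 1, 5, dtype=torch.float64), device)
```

The buffers are only half of the picture: the model's own weights need the same dtype, for example by calling `model.float()` before moving it to the device, or by setting use_fp16 to False in the config as reported earlier in the thread.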