Stability-AI / stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models
MIT License

RuntimeError: expected scalar type BFloat16 but found Float #236

Open picard314 opened 1 year ago

picard314 commented 1 year ago

Below is the log I got when running:

python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt <path/to/768model.ckpt/> --config configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768

Running DDIM Sampling with 50 timesteps
DDIM Sampler: 0%| | 0/50 [00:00<?, ?it/s]
data: 0%| | 0/1 [00:00<?, ?it/s]
Sampling: 0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "scripts/txt2img.py", line 388, in <module>
    main(opt)
  File "scripts/txt2img.py", line 347, in main
    samples, _ = sampler.sample(S=opt.steps,
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddim.py", line 104, in sample
    samples, intermediates = self.ddim_sampling(conditioning, size,
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddim.py", line 164, in ddim_sampling
    outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_steps=ddim_use_original_steps,
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddim.py", line 212, in p_sample_ddim
    model_uncond, model_t = self.model.apply_model(x_in, t_in, c_in).chunk(2)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, cond)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
    h = module(h, emb, context)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/modules/attention.py", line 327, in forward
    x = self.norm(x)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 272, in forward
    return F.group_norm(
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/functional.py", line 2516, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type BFloat16 but found Float

Has anyone run into the same error and found a solution?

simonnxren commented 1 year ago

Have you solved the issue?

picard314 commented 1 year ago

> Have you solved the issue?

Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

adirz commented 1 year ago

> Have you solved the issue?
>
> Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

I am facing the same issue myself. Is it incompatible with CUDA altogether, or only with a particular version of it? I have a hard time imagining running this without the GPU. How did you fix it?

picard314 commented 1 year ago

Embarrassingly, I switched to running on the GPU to get around the issue. The incompatibility was in fact a problem I hit when I used the GPU. I have not looked into running on the CPU any further, but I guess changing the torch version may help you, @adirz.

@simonnxren Sorry for giving you a vague answer.

wobblytables commented 1 year ago

@picard314 I have run into this issue too. I was able to make adjustments so that the code runs, but it is using my CPU and not my NVIDIA GPU. I'm running CUDA 11.7, as that seemed to be the correct version. Which CUDA version are you using, and what exactly did you do to resolve this issue?

picard314 commented 1 year ago

@wobblytables Mine is CUDA 11.4.

For CUDA 11.7, I think the installation command needs to be:

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
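Before reinstalling, it can help to check which CUDA version the installed PyTorch build targets and whether it sees a GPU at all. A minimal sketch, assuming only that PyTorch may or may not be installed in the environment; `describe_torch_cuda` is a hypothetical helper name, not something from this repo:

```python
import importlib.util


def describe_torch_cuda():
    """Return a dict describing the installed PyTorch build and CUDA visibility."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return info

    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds.
    info["cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `cuda_build` is `None`, the installed build is CPU-only, and `cuda_available` will be `False` no matter which driver is installed.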

lijain commented 1 year ago

> Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

I had the same problem and solved it by setting things up to run on the GPU.

320010ly commented 1 year ago

> Yes, I have. It was due to an incompatibility between PyTorch and CUDA.
>
> I had the same problem and solved it by setting things up to run on the GPU.

I ran into the same problem. Do you mean using something like setting CUDA_VISIBLE_DEVICES to set up the GPU? Thank you very much.

hotelbread commented 1 year ago

> @wobblytables Mine is CUDA 11.4.
>
> For CUDA 11.7, I think the installation command needs to be:
>
> conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia

If you don't mind, could you tell me your GPU model and which version of PyTorch you used? I have a GeForce 3060 and used CUDA 11.4 with PyTorch 1.12.1, but I hit that error. I then changed the CUDA version to 11.6 and still have the same problem...

Mateusmsouza commented 1 year ago

Adding --device cuda worked for me.

It looks like a change made the CPU the default device (https://github.com/Stability-AI/stablediffusion/pull/147/files#diff-048b7bba4049f97b2038502af5686b6c5f53a882ff02771fcb0d733d22a0ab6cR180-R186). I think that was messing up the data types.
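To illustrate why the flag matters, here is a minimal sketch of an argument parser whose `--device` option defaults to `cpu`. It is modeled on the behavior described in this comment, not copied from `scripts/txt2img.py`; `build_parser` is a hypothetical name and the help text is my own:

```python
import argparse


def build_parser():
    """Build a parser with a --device option that falls back to the CPU."""
    parser = argparse.ArgumentParser(description="Illustrative device selection")
    parser.add_argument(
        "--device",
        type=str,
        choices=["cpu", "cuda"],
        default="cpu",  # assumption: mirrors the CPU default described above
        help="Device to run on; pass 'cuda' explicitly to use the GPU",
    )
    return parser


if __name__ == "__main__":
    opt = build_parser().parse_args()
    print(f"Running on: {opt.device}")
```

With a default like this, omitting the flag silently selects the CPU even on a machine with a working GPU, so the GPU has to be requested with `--device cuda`.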

order-a-lemonade commented 1 year ago

> Adding --device cuda worked for me.
>
> It looks like a change made the CPU the default device (https://github.com/Stability-AI/stablediffusion/pull/147/files#diff-048b7bba4049f97b2038502af5686b6c5f53a882ff02771fcb0d733d22a0ab6cR180-R186). I think that was messing up the data types.

Nice solution, it worked for me too.

asdfjkluiop commented 10 months ago

How do you fix this error when you actually want to run it on the CPU? I can't find a way to do that.

esiefker commented 9 months ago

Is this going to get fixed?

I read the documentation, installed the requirements, and ran the example. It crashed with this error message.

That seems like a pretty critical bug, but it hasn't even been assigned to anyone yet after 9 months.

questor commented 9 months ago

As a hint, here is something that might help: use "--precision full" (taken from https://huggingface.co/CompVis/stable-diffusion-v1-4/discussions/42). In addition, there are special configs for CPU processing in the "intel" folder of this repo. I'm currently using the "-fp32" config in combination with the precision flag, and it at least generates some images. I'm not sure what the root cause really is, as I'm no expert in this field, but this looks suspicious: https://github.com/Stability-AI/stablediffusion/blob/main/ldm/modules/attention.py#L175
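A minimal sketch of what "--precision full" plausibly changes, assuming the script wraps sampling in `torch.autocast` unless full precision is requested (an assumption, not something confirmed in this thread). On CPU, `torch.autocast` uses bfloat16 as its default lower-precision dtype, which would be consistent with the BFloat16 vs Float mismatch in the traceback. `precision_scope` is a hypothetical helper name:

```python
from contextlib import nullcontext


def precision_scope(device: str, precision: str):
    """Return the context manager to wrap the sampling loop in.

    precision == "full": no autocasting, tensors keep their float32 dtype.
    anything else: torch.autocast for the given device type.
    """
    if precision == "full":
        return nullcontext()
    # Imported lazily so the "full" path does not require PyTorch.
    import torch

    # On CPU, autocast's default lower-precision dtype is bfloat16.
    return torch.autocast(device_type=device)


if __name__ == "__main__":
    with precision_scope("cpu", "full"):
        print("Sampling would run here in float32")
```

Under this reading, requesting full precision on the CPU sidesteps autocasting entirely, so activations and layer weights stay in the same dtype.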

SofiaBianchi123 commented 7 months ago

I am having the same issue when I try to load the X_IA3 adapters. This is what I installed:

!pip install torch==2.0.1 transformers datasets peft accelerate trl bitsandbytes optimum