AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Removing --no-half causes errors under macOS with any Torch version, but almost all samplers produce only noise with it on the latest nightly builds #8555

Open Vitkarus opened 1 year ago

Vitkarus commented 1 year ago

Is there an existing issue for this?

What happened?

I've tested different versions of Torch to try to find one that works with --no-half, but no luck.

1.14.0.dev20221025, which I'm currently using, works fine but throws errors without the --no-half argument. The latest nightly version 2.1.0.dev20230312 seems to work with this argument and gives a really noticeable performance boost, but almost all samplers break on it.

My results

With --no-half: there are no errors, but all samplers apart from DDIM and PLMS produce only noise as the final result; those two give out normal pictures. Also, the new UniPC produces something slightly less noisy, but still really messy.

Without --no-half: errors with everything except DDIM and PLMS. Those two also run around 40% faster than with --no-half.

Without --no-half and with --disable-nan-check: Just black images instead of noise.

Steps to reproduce the problem

I was just changing startup arguments

What should have happened?

Other samplers should work too, I guess

Commit where the problem happens

3c922d98

What platforms do you use to access the UI ?

MacOS

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

--opt-sub-quad-attention --skip-torch-cuda-test --upcast-sampling --use-cpu interrogate --no-half

List of extensions

No

Console logs

Error completing request
Arguments: ('task(9rr6te8wtxyte2o)', 'watermelon', '', [], 20, 16, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/stable-diffusion-webui/modules/processing.py", line 635, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/stable-diffusion-webui/modules/processing.py", line 835, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 227, in launch_sampling
    return func()
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 553, in sample_dpmpp_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 145, in forward
    devices.test_for_nans(x_out, "unet")
  File "/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)

modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

Additional information

Intel Mac with RX 6600XT, MacOS 13.2.1
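For context, the NansException above is characteristic of float16's limited range: the largest finite half-precision value is 65504, so large intermediate activations overflow to infinity, and a later inf - inf turns into NaN, which then spreads through the whole tensor. A minimal sketch of this in pure Python (the helper name through_fp16 is invented; it uses the struct module's IEEE-754 half-precision 'e' format rather than torch):

```python
import math
import struct

def through_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE-754 half precision ('e' format)."""
    try:
        return struct.unpack('e', struct.pack('e', x))[0]
    except OverflowError:
        # Magnitudes beyond fp16's max finite value (65504) overflow.
        return math.inf if x > 0 else -math.inf

# Ordinary values survive the round trip intact:
assert through_fp16(1.5) == 1.5
# Large intermediate values overflow to infinity,
print(through_fp16(70000.0))   # inf
# and inf - inf in a later op yields NaN, which propagates everywhere:
print(math.inf - math.inf)     # nan
# Tiny values underflow to zero (loss of precision):
print(through_fp16(1e-8))      # 0.0
```

This is why --no-half (running everything in fp32) or upcasting only the sensitive layers to float32 makes the NaN check pass.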

Homemaderobot commented 1 year ago

I'm getting similar errors on my 32GB M1 Max: macOS 13.2, Torch v1.12.1, commit a9fed7c.

Default Command Line Arguments: --upcast-sampling --no-half-vae --use-cpu interrogate

The problem is with v2-1_768 (2.0, 1.5 & 1.4 work fine). With SD2.1 it errors at 0% with all sampling methods except DDIM, PLMS & UniPC:

Error completing request
Arguments: ('task(ou61msdjo5m7nj1)', 'photo of a man', '', [], 10, 15, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 768, 768, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/js/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 636, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 836, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 227, in launch_sampling
    return func()
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 351, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/js/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/Users/js/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/Users/js/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/js/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 145, in forward
    devices.test_for_nans(x_out, "unet")
  File "/Users/js/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

With DDIM, PLMS & UniPC it gets to 100% but produces a black square and no saved image:

Error completing request
Arguments: ('task(ao35ifugi48be7m)', 'photo of a man', '', [], 10, 19, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 768, 768, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/js/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/js/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/Users/js/stable-diffusion-webui/modules/processing.py", line 640, in process_images_inner
    devices.test_for_nans(x, "vae")
  File "/Users/js/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in VAE. Use --disable-nan-check commandline argument to disable this check.

I have tried with all extensions off and deleted the /venv directory, with the same results.

I've been absorbed in ControlNet with SD1.5 for a while, so I'm not sure how long this has been a problem.

Thanks.

yuhuihu commented 1 year ago

Totally, the same issue on my device.

Scaperista commented 1 year ago

I have the same problem after installing the fresh torch 2.0 GA. Without --no-half I get input types 'tensor<1x77x1xf16>' and 'tensor<1xf32>' are not broadcast compatible, and with --no-half most of the samplers just produce gibberish images with noise. This is on a non-M1 Mac running macOS 12.6.3. Before the torch upgrade everything worked quite well, aside from some random MPS problems regarding memory watermarks.

brkirch commented 1 year ago

1.14.0.dev20221025 I'm currently using works fine but throws errors without the --no-half argument.

So all samplers work correctly with that build? Would you be able to test builds from the dates in between that and 2.1.0.dev20230312 to determine the latest build that works correctly? I haven’t been able to reproduce this issue but if I knew the exact date of the last working PyTorch build then I could likely determine more about this issue and make a workaround.

Edit: Actually, if you could give me the traceback from the error you get from using 1.14.0.dev20221025 without --no-half, that may give me a better idea of what is going wrong.
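The bisection brkirch asks for is a halving search over nightly dates; a tiny sketch of the bookkeeping (the helper name next_nightly_to_test is invented) that picks which nightly to install next:

```python
from datetime import date, timedelta

def next_nightly_to_test(good: date, bad: date):
    """Midpoint of a (last-known-good, first-known-bad) nightly date range.

    Returns None once the range has been narrowed to adjacent days,
    i.e. the regression happened between `good` and `bad`.
    """
    span = (bad - good).days
    if span <= 1:
        return None
    return good + timedelta(days=span // 2)

# Narrowing the range mentioned in this thread:
print(next_nightly_to_test(date(2022, 10, 25), date(2023, 3, 12)))  # 2023-01-02
```

Each test halves the range: if the midpoint nightly works, it becomes the new good date, otherwise the new bad date. Installing a dated nightly would be something like pip install --pre "torch==2.1.0.devYYYYMMDD" against the PyTorch nightly index (the exact command depends on the index URL and platform).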

Vitkarus commented 1 year ago

@brkirch I'm not entirely sure how to use it with 1.14.0 now, because it's installed in my Conda environment and the WebUI always tries to create its own venv and install dependencies there. Maybe I can somehow force the WebUI to skip creating another venv? Before that I used the old version, which I deleted.

Scaperista commented 1 year ago

So I just tested some more torch versions from the nightlies, and the earliest version (torch-2.0.0.dev20230128, torchvision-0.15.0.dev20230128) showed the same effects for me as the latest one (torch-2.1.0.dev20230327, torchvision-0.16.0.dev2023032). Without --no-half the latest nightly gives me: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half', and with --no-half only noise is generated as the image, e.g. when using the Euler sampler.

Awethon commented 1 year ago

@brkirch Same for me, just pulled master branch and updated dependencies. webui config is default 2.1_768 model, stable torch 2.0, M1 Max.

For DPM++ 2M/SDE Karras/non-Karras:

  File "/Users/username/Documents/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 653, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 869, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 234, in launch_sampling
    return func()
  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/username/Documents/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/username/Documents/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 128, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)
  File "/Users/username/Documents/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/username/Documents/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 152, in forward
    devices.test_for_nans(x_out, "unet")
  File "/Users/username/Documents/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

For DDIM:

  File "/Users/username/Documents/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "/Users/username/Documents/stable-diffusion-webui/modules/processing.py", line 657, in process_images_inner
    devices.test_for_nans(x, "vae")
  File "/Users/username/Documents/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in VAE. Use --disable-nan-check commandline argument to disable this check.

imtiger commented 1 year ago

I have the same problem after installing the fresh torch 2.0 GA. Without --no-half I get input types 'tensor<1x77x1xf16>' and 'tensor<1xf32>' are not broadcast compatible, and with --no-half most of the samplers just produce gibberish images with noise. This is on a non-M1 Mac running macOS 12.6.3. Before the torch upgrade everything worked quite well, aside from some random MPS problems regarding memory watermarks.

me too

imtiger commented 1 year ago

So I just tested some more torch versions from the nightlies, and the earliest version (torch-2.0.0.dev20230128, torchvision-0.15.0.dev20230128) showed the same effects for me as the latest one (torch-2.1.0.dev20230327, torchvision-0.16.0.dev2023032). Without --no-half the latest nightly gives me: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half', and with --no-half only noise is generated as the image, e.g. when using the Euler sampler.

Have you solved this? I also have the problem.

GoJerry commented 1 year ago

RuntimeError

I also hit the same problem. Can you tell me how you solved it? Thanks. macOS, M1.

GoJerry commented 1 year ago

So I just tested some more torch versions from the nightlies, and the earliest version (torch-2.0.0.dev20230128, torchvision-0.15.0.dev20230128) showed the same effects for me as the latest one (torch-2.1.0.dev20230327, torchvision-0.16.0.dev2023032). Without --no-half the latest nightly gives me: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half', and with --no-half only noise is generated as the image, e.g. when using the Euler sampler.

Have you solved this? I also have the problem.

Me too!! How do I solve this?

hstk30 commented 1 year ago

Error completing request
Arguments: ('task(vim8g0n4kh0utdt)', 'create a classic woman with the pearl necklace', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/processing.py", line 653, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/processing.py", line 869, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 234, in launch_sampling
    return func()
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 358, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 152, in forward
    devices.test_for_nans(x_out, "unet")
  File "/Users/hstk/code_ground/source_code/stable-diffusion-webui/modules/devices.py", line 152, in test_for_nans
    raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.

Me too! On my Apple M1 Pro, 16GB.

FreeBlues commented 1 year ago

Try torch 1.12.1. Someone said that 1.12.1 is the only version that works on Mac; all other versions have issues.

hstk30 commented 1 year ago

Try torch 1.12.1. Someone said that 1.12.1 is the only version that works on Mac; all other versions have issues.

Still doesn't work for me.

digital-pers0n commented 1 year ago

Try torch 1.12.1. Someone said that 1.12.1 is the only version that works on Mac; all other versions have issues.

I think something is broken in Ventura. I can use torch 1.12 and 1.13 with no problems in Monterey (I tested an RX 570 8GB and an RX 6800 XT), and torch 2.x also works correctly if you launch SD with --use-cpu all. In Ventura nothing works for me.
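Since much of this thread turns on which torch build is actually in the webui's venv, a small optional check can report it. This is a sketch (the function name torch_env_summary is invented); it assumes only that torch ≥ 1.12 exposes torch.backends.mps, and it returns None when torch isn't importable:

```python
import importlib.util

def torch_env_summary():
    """Report the installed torch version and whether the MPS backend is usable.

    Returns None when torch is not importable (e.g. outside the webui venv).
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    mps = getattr(torch.backends, "mps", None)
    return {
        "version": torch.__version__,
        "mps_available": bool(mps is not None and mps.is_available()),
    }

# e.g. {'version': '2.0.0', 'mps_available': True} inside the venv, or None outside it
print(torch_env_summary())
```

Running this with the webui's venv activated (source venv/bin/activate) makes it unambiguous which build each report in this thread corresponds to.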

ZhelenZ commented 1 year ago

It may be related to the nightly PyTorch version. I encountered the same issue, reverted to my older version, and the problem was solved! Check out the link below; it explains well how and why : ) https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/7453#discussion-4812369

congzai520 commented 1 year ago

Me too.

ZXBmmt commented 1 year ago

Edit stable-diffusion-webui/webui-user.sh.

Find the COMMANDLINE_ARGS variable.

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

oivoodoo commented 1 year ago

Hi @ZXBmmt

it worked for me!

Thank you!

(Apple M1 Pro)

Vitkarus commented 1 year ago

I updated to macOS 13.3.1 and installed the latest commit [5ab7f213], but unfortunately things got even worse. Now even the DDIM sampler produces wrong images without --no-half

iWooda commented 1 year ago

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

It didn't work.

ctrlaltdylan commented 1 year ago

Same issue, MacOS Monterey (12.6.2 (21G320)) on master branch.

huanDreamer commented 1 year ago

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

It works on my MacBook M1, thanks.

finalcolor commented 1 year ago

Open the webui-macos-env.sh file with your text editor.

Change : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"

To : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

iWooda commented 1 year ago

Open webui-macos-env.sh file with your textedit.

Change : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"

To : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

Thanks, it worked. There is another solution without changing any files: use the start command line ./webui.sh --precision full --no-half. That also worked for me.

finalcolor commented 1 year ago

Great, you're very clever!

finalcolor commented 1 year ago

So I just tested some more torch versions from the nightlies, and the earliest version (torch-2.0.0.dev20230128, torchvision-0.15.0.dev20230128) showed the same effects for me as the latest one (torch-2.1.0.dev20230327, torchvision-0.16.0.dev2023032). Without --no-half the latest nightly gives me: RuntimeError: "upsample_nearest2d_channels_last" not implemented for 'Half', and with --no-half only noise is generated as the image, e.g. when using the Euler sampler.

Have you solved this? I also have the problem.

Open the webui-macos-env.sh file with your text editor.

Change : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate"

To : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

The second solution: use the start command line ./webui.sh --precision full --no-half

Muscleape commented 1 year ago

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

Hi @ZXBmmt

it worked for me (MacOS 13.2.1 16G commit baf6946e)

Thank you!

(Apple M1 Pro)

howyeah commented 1 year ago

Open webui-macos-env.sh file with your textedit. Change : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate" To : export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --no-half-vae --no-half --use-cpu interrogate"

Thanks, It worked. There is Another solution without change any files, That is use the Start command line: ./webui.sh --precision full --no-half Also worked for me.

Thank you so much! It worked.

befriend1314 commented 1 year ago

./webui.sh --no-half

This is how I solved it.

matto80 commented 1 year ago

thank you!!

adevart commented 1 year ago

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

Thanks, this worked for me to fix the error. I didn't need the --use-cpu all part; that runs on the CPU, which is slower than the GPU. It's not as slow on the CPU as I thought it would be, though, and it only seems to use half of my CPU.

The time for the same 512x512 image was 1:20 on the CPU and 0:20 on the M1 Max GPU, which is roughly the same as my Nvidia 3060 mobile GPU. The Nvidia GPU runs its fan loudly, but the M1 Max barely gets warm; even after hundreds of images there was no fan noise at all. It's great that I can use Stable Diffusion on the Mac, and models like OpenJourney work the same as they do on Windows.

mclark4386 commented 1 year ago

--opt-split-attention --lowvram --no-half --use-cpu all

What were your command line args, if you don't mind?

adevart commented 1 year ago

--opt-split-attention --lowvram --no-half --use-cpu all

What were your command line args, if you don't mind?

I used:

export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half"

NateLevin1 commented 1 year ago

This is the correct solution, according to how you are 'supposed' to do it:

Open webui-user.sh.

Replace the line that says:

#export COMMANDLINE_ARGS=""

With:

export COMMANDLINE_ARGS="$COMMANDLINE_ARGS --no-half"

Then everything works great!

maxwwc commented 1 year ago

edit “stable-diffusion-webui/webui-user.sh”

find COMMANDLINE_ARGS variable

Use the following configuration: export COMMANDLINE_ARGS="--skip-torch-cuda-test --upcast-sampling --opt-split-attention --lowvram --no-half --use-cpu all"

You helped a lot. Thanks.

MacOS 12.6.6, Intel.

AndiDog commented 9 months ago

I had NansException and other errors on an M1 Pro with 32 GB RAM, but upgrading some packages helped (in the virtual environment: pip install --upgrade torch torchvision transformers). That is, use newer versions than the ones installed by the webui. SD XL 1.0 base + refiner work that way (txt2img and img2img refinement), while I only got errors before.