invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0

[bug]: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead. #4831

Open peterdandreti opened 1 year ago

peterdandreti commented 1 year ago

Is there an existing issue for this?

OS

macOS

GPU

mps

VRAM

15.0

What version did you experience this issue on?

3.2.0rc3

What happened?

TypeError when using the Euler Karras scheduler (SD 1.x or SDXL), producing a red-text error.

TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

[2023-10-09 10:15:12,303]::[InvokeAI]::ERROR --> Error while invoking: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
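
For context, this is a general MPS limitation rather than anything InvokeAI-specific: moving any float64 tensor onto the MPS device produces the same message. A minimal illustration (mine, not part of the original report):

    import torch

    # Mirrors the scheduler's torch.from_numpy(timesteps).to(device) call,
    # where the NumPy array defaults to float64.
    torch.zeros(3, dtype=torch.float64).to("mps")  # TypeError on MPS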

Screenshots

No response

Additional context

No response

Contact Details

No response

psychedelicious commented 1 year ago

@gogurtenjoyer @Millu Could y'all give Euler Karras a spin on M1?

Vargol commented 1 year ago

I get the same on my M1

[2023-10-09 18:24:42,253]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/Volumes/Sabrent Media/Documents/Source/Python/InvokeAI3.2/InvokeAI/invokeai/app/services/processor.py", line 106, in __process
    outputs = invocation.invoke_internal(
  File "/Volumes/Sabrent Media/Documents/Source/Python/InvokeAI3.2/InvokeAI/invokeai/app/invocations/baseinvocation.py", line 609, in invoke_internal
    output = self.invoke(context)
  File "/Volumes/Sabrent Media/Documents/Source/Python/InvokeAI3.2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Volumes/Sabrent Media/Documents/Source/Python/InvokeAI3.2/InvokeAI/invokeai/app/invocations/latent.py", line 587, in invoke
    num_inference_steps, timesteps, init_timestep = self.init_scheduler(
  File "/Volumes/Sabrent Media/Documents/Source/Python/InvokeAI3.2/InvokeAI/invokeai/app/invocations/latent.py", line 461, in init_scheduler
    scheduler.set_timesteps(steps, device=device)
  File "/Volumes/Sabrent Media/Documents/Source/Python/InvokeAI3.2/lib/python3.10/site-packages/diffusers/schedulers/scheduling_euler_discrete.py", line 276, in set_timesteps
    self.timesteps = torch.from_numpy(timesteps).to(device=device)
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

[2023-10-09 18:24:42,283]::[InvokeAI]::ERROR --> Error while invoking:
Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead
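
The failure also reproduces outside InvokeAI with diffusers alone, which supports the "Diffusers-level fix" reading below. A minimal sketch (my illustration, assuming an MPS device and a recent diffusers; the step count is arbitrary):

    import torch
    from diffusers import EulerDiscreteScheduler

    # With Karras sigmas enabled, the internal timesteps array stays float64;
    # torch.from_numpy(...).to("mps") then fails, since MPS lacks float64.
    scheduler = EulerDiscreteScheduler(use_karras_sigmas=True)
    scheduler.set_timesteps(30, device="mps")  # TypeError on MPS
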
psychedelicious commented 1 year ago

Thanks @Vargol

I'm guessing this is a regression due to a recent torch update but I'm not sure how to proceed. Any ideas?

@StAlKeR7779

gogurtenjoyer commented 1 year ago

Can confirm - same error as above. Also, I'm using a PyTorch 2.2.0 nightly, so I guess it's not getting fixed any time soon.

Vargol commented 1 year ago

I don't think Metal supports 64-bit floating point values at all, so this is more of a Diffusers-level fix.

Do the timesteps really need to be float64? Could they stick an astype(np.float32) in?

Original code is....

        sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32)
        self.sigmas = torch.from_numpy(sigmas).to(device=device)

        self.timesteps = torch.from_numpy(timesteps).to(device=device)
        self._step_index = None

maybe something like...

        sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32)
        self.sigmas = torch.from_numpy(sigmas).to(device=device)

        self.timesteps = torch.from_numpy(timesteps.astype(np.float32)).to(device=device)
        self._step_index = None

or put a guard around it so the cast only happens when device == 'mps', as sketched below.
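
That guard might look something like the following (a sketch only, reusing the local names from the diffusers snippet above; untested):

        sigmas = np.concatenate([sigmas, [0.0]]).astype(np.float32)
        self.sigmas = torch.from_numpy(sigmas).to(device=device)

        if str(device).startswith("mps"):
            # MPS has no float64 support, so downcast before moving to the device.
            self.timesteps = torch.from_numpy(timesteps.astype(np.float32)).to(device=device)
        else:
            self.timesteps = torch.from_numpy(timesteps).to(device=device)
        self._step_index = None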

peterdandreti commented 1 year ago

Just did a few further checks: the same issue exists with:

  • Heun Karras
  • LMS Karras.

All other schedulers didn't produce this error.

RaphaelDreams commented 1 year ago

I have this issue as well now, after updating to 3.3:

[2023-10-13 12:48:08,016]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/app/services/processor.py", line 106, in __process
    outputs = invocation.invoke_internal(
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/app/invocations/baseinvocation.py", line 610, in invoke_internal
    output = self.invoke(context)
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/app/invocations/compel.py", line 289, in invoke
    c1, c1_pooled, ec1 = self.run_clip_compel(
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/app/invocations/compel.py", line 162, in run_clip_compel
    tokenizer_info = context.services.model_manager.get_model(
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/app/services/model_manager_service.py", line 374, in get_model
    model_info = self.mgr.get_model(
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/backend/model_management/model_manager.py", line 494, in get_model
    model_context = self.cache.get_model(
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/backend/model_management/model_cache.py", line 226, in get_model
    snapshot_before = MemorySnapshot.capture()
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/backend/model_management/memory_snapshot.py", line 57, in capture
    malloc_info = LibcUtil().mallinfo2()
  File "/media/Raphael/Storage/InvokeAI_WebUI/Latest/Latest/.venv/lib/python3.11/site-packages/invokeai/backend/model_management/libc_util.py", line 73, in mallinfo2
    mallinfo2 = self._libc.mallinfo2
  File "/home/Raphael/.pyenv/versions/3.11.0/lib/python3.11/ctypes/__init__.py", line 389, in __getattr__
    func = self.__getitem__(name)
  File "/home/Raphael/.pyenv/versions/3.11.0/lib/python3.11/ctypes/__init__.py", line 394, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /lib/x86_64-linux-gnu/libc.so.6: undefined symbol: mallinfo2

[2023-10-13 12:48:08,089]::[InvokeAI]::ERROR --> Error while invoking: /lib/x86_64-linux-gnu/libc.so.6: undefined symbol: mallinfo2

psychedelicious commented 1 year ago

@RaphaelDreams this is a different issue. Can you please create a separate GitHub issue for it? Please include your OS/platform details

Edit: nevermind, made the issue for ya #4878

Adreitz commented 1 year ago

Just did a few further checks: the same issue exists with:

  • Heun Karras
  • LMS Karras.

All other schedulers didn't produce this error.

Confirmed. Are you also seeing a different error with DPM++ 2S Karras: RuntimeError: a Tensor with 2 elements cannot be converted to Scalar?

peterdandreti commented 1 year ago

@Adreitz - I don't have any errors with DPM++ 2S Karras (M1 Max with 15GB configured for Invoke).

Adreitz commented 1 year ago

@peterdandreti Very strange. I downgraded to torch 2.0.1 and I'm still seeing it. Are you running Sonoma or Ventura? Or perhaps it is specific to SDXL?

I'm also seeing a number of other issues: DPM++ 2M SDE producing very noisy output; various other schedulers producing output with slight residual noise (sometimes only noticeable as subtle chroma blotching in shadows or coarse luma noise in gradients); and persistent nondeterminism in my 2-stage SDXL upscaling workflow (but only in the high-res output!). I can't tell if something is wrong with my setup or others just haven't noticed.

peterdandreti commented 1 year ago

@Adreitz I've just rerun some new tests on post2. I couldn't reproduce any red-text errors with SD 1.x or SDXL (edit: this was for DPM++ 2S Karras).

My system and versions: M1 Max, Sonoma 14.0 (23A344), Python 3.10.9, Invoke 3.3.0post2, default script install (i.e. I didn't install PyTorch manually or run any secondary scripts).

Side note: fp16 VAE Precision usually renders the final frame in black for SDXL, regardless of the scheduler.

Vargol commented 1 year ago

Yes, the VAE not working at fp16 is a known issue with the VAEs that come with SDXL;
you need to use madebyollin's VAE to run at fp16.
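
Outside InvokeAI, the equivalent swap with plain diffusers looks roughly like this (a sketch, assuming access to the Hugging Face Hub; madebyollin/sdxl-vae-fp16-fix is the community fp16-safe SDXL VAE):

    import torch
    from diffusers import AutoencoderKL, StableDiffusionXLPipeline

    # Load the fp16-safe SDXL VAE and substitute it for the built-in one.
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
    )
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", vae=vae, torch_dtype=torch.float16
    )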

Adreitz commented 1 year ago

@peterdandreti My setup is similar, but with an M2 Max with 64GB and Python 3.10.13. I had the PyTorch (and torchvision) nightlies installed in the hope that they might have some benefit on Sonoma, though I haven't seen any improvements. I rolled them back and the performance and bugs seem the same.

You'll need to use madebyollin's SDXL VAE to use fp16 for the VAE. I straight up replaced the built-in VAE.

Vargol commented 1 year ago

There are improvements in the PyTorch nightlies; they fix a few MPS fp16 bugs in PyTorch that affect SD. You won't see the benefit with InvokeAI, though, as it had to work around those bugs and the workarounds are still in place.

peterdandreti commented 9 months ago

Just an update - this issue has been resolved for all of the mentioned schedulers.

peterdandreti commented 9 months ago

Closed this one too quickly - the problem still exists for Heun Karras and LMS Karras.