invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0

[bug]: [5.4.1] Memory regression #7328

Open grunblatt-git opened 1 week ago

grunblatt-git commented 1 week ago

Is there an existing issue for this problem?

Operating system

macOS

GPU vendor

Apple Silicon (MPS)

GPU model

No response

GPU VRAM

No response

Version number

5.4.1rc2

Browser

Chrome

Python dependencies

No response

What happened

Image generation aborts with the following error when generating "large" images (e.g. 832 x 1248 with an SD1.5 model).

Traceback (most recent call last):
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/app/invocations/baseinvocation.py", line 298, in invoke_internal
    output = self.invoke(context)
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/app/invocations/denoise_latents.py", line 812, in invoke
    return self._old_invoke(context)
  File "invokeai/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File ".pyenv/versions/3.10.9/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/app/invocations/denoise_latents.py", line 1073, in _old_invoke
    result_latents = pipeline.latents_from_embeddings(
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 394, in latents_from_embeddings
    step_output = self.step(
  File "invokeai/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 545, in step
    uc_noise_pred, c_noise_pred = self.invokeai_diffuser.do_unet_step(
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/backend/stable_diffusion/diffusion/shared_invokeai_diffusion.py", line 199, in do_unet_step
    ) = self._apply_standard_conditioning(
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/backend/stable_diffusion/diffusion/shared_invokeai_diffusion.py", line 343, in _apply_standard_conditioning
    both_results = self.model_forward_callback(
  File "invokeai/.venv/lib/python3.10/site-packages/invokeai/backend/stable_diffusion/diffusers_pipeline.py", line 608, in _unet_forward
    return self.unet(
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/diffusers/models/unets/unet_2d_condition.py", line 1216, in forward
    sample, res_samples = downsample_block(
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/diffusers/models/unets/unet_2d_blocks.py", line 1288, in forward
    hidden_states = attn(
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/diffusers/models/transformers/transformer_2d.py", line 442, in forward
    hidden_states = block(
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/diffusers/models/attention.py", line 507, in forward
    attn_output = self.attn1(
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "invokeai/.venv/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 495, in forward
    return self.processor(
  File "invokeai/.venv/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 2383, in __call__
    hidden_states = F.scaled_dot_product_attention(
RuntimeError: Invalid buffer size: 288.00 GB

What you expected to happen

Since generating images with the same settings works perfectly up to version 5.0.2, I would expect Invoke to use a similar amount of memory for generation in 5.4.1 as well.

I can't verify whether this issue already exists in 5.1.x - 5.3.x, since I can't install those versions.

How to reproduce the problem

No response

Additional context

The amount of memory required according to the error message scales with the image resolution:

832x1248  | RuntimeError: Invalid buffer size: 7.84 GB
1024x1536 | RuntimeError: Invalid buffer size: 18.00 GB
1280x1920 | RuntimeError: Invalid buffer size: 43.95 GB
1600x2400 | RuntimeError: Invalid buffer size: 107.29 GB
2048x3072 | RuntimeError: Invalid buffer size: 288.00 GB
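These values are consistent with a back-of-the-envelope estimate of a fully materialised attention score matrix at the UNet's highest-resolution block. The sketch below is my own rough arithmetic under assumed values (fp16, batch of 2 for cond/uncond, 8 attention heads, latents at 1/8 of the image resolution); none of these assumptions are confirmed in the issue.

# Rough sketch (assumptions: fp16 = 2 bytes, batch 2 for cond + uncond,
# 8 attention heads, latents at 1/8 image resolution). Estimates the size of
# a fully materialised (tokens x tokens) attention score matrix.
def attn_score_buffer_gib(width: int, height: int,
                          heads: int = 8, batch: int = 2, bytes_per_el: int = 2) -> float:
    tokens = (width // 8) * (height // 8)  # flattened latent spatial size
    return batch * heads * tokens * tokens * bytes_per_el / 2**30

for w, h in [(832, 1248), (1024, 1536), (1280, 1920), (1600, 2400), (2048, 3072)]:
    print(f"{w}x{h}: {attn_score_buffer_gib(w, h):.2f} GB")
# -> 7.84, 18.00, 43.95, 107.29, 288.00 GB, matching the table above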

I initially thought this was an issue with my model cache and created a new Invoke installation in a separate folder. I only imported a single model into this installation, but I get the same error with the same memory requirements.

I also noticed that image generation in 5.4.1 was slower (up to 7.76 s/it) than in 5.0.2 (a consistent 2.65 s/it).

Adding/removing LoRAs or changing the scheduler does not seem to affect the error.

Rolling back to Invoke v5.0.2 resolves this issue.

Discord username

No response