lllyasviel / Fooocus

Focus on prompting and generating
GNU General Public License v3.0

torch.cuda.OutOfMemoryError: CUDA out of memory. #891

Closed (Azaki9 closed 8 months ago)

Azaki9 commented 10 months ago

I downloaded all the files for Fooocus, but whenever I try to input an image for a variation or an Image Prompt, I get this error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 1.57 GiB is free. Of the allocated memory 3.23 GiB is allocated by PyTorch, and 178.84 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
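(For anyone puzzling over those numbers: here is a minimal sketch of where they come from, using PyTorch's public memory APIs. It assumes you run it in the same environment with the GPU visible; the GiB figures in the comments are the ones from the message above.)

```python
import torch

# "6.00 GiB" total and "1.57 GiB is free" are reported by the driver:
free, total = torch.cuda.mem_get_info()

# "3.23 GiB is allocated by PyTorch" = memory held by live tensors;
# "reserved by PyTorch but unallocated" = allocator cache minus live tensors:
allocated = torch.cuda.memory_allocated()
reserved = torch.cuda.memory_reserved()

print(f"free {free / 2**30:.2f} GiB of {total / 2**30:.2f} GiB")
print(f"allocated {allocated / 2**30:.2f} GiB, "
      f"reserved but unallocated {(reserved - allocated) / 2**20:.2f} MiB")
```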

Here's the full log for reference:

D:\Software\AI\Fooocus>.\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.780
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Total VRAM 6144 MB, total RAM 14188 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 NVIDIA GeForce RTX 3060 Laptop GPU : native
VAE dtype: torch.bfloat16
Using pytorch cross attention
model_type EPS
adm 2560
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
missing {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Refiner model loaded: D:\Software\AI\Fooocus\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: D:\Software\AI\Fooocus\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.88 seconds
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Seed = 825443032479212006
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 24
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] nature background wedding park, intricate, elegant, highly detailed, colorful, warm light, sharp focus, symmetry, epic composition, joyful, historic, holy, dramatic, scenic, best, contemporary, dynamic, full color, coherent, creative, calm, beautiful, cute, open, lovely, pretty, friendly, rare, radiant, magical, passionate, magic, atmosphere, ambient
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.33 seconds
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
[Fooocus] VAE encoding ...
Final resolution is (1432, 1080).
Preparation time: 5.89 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.02916753850877285, sigma_max = 1.2123907804489136
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3118.8976106643677
[Fooocus Model Management] Moving model(s) has taken 16.01 seconds
  0%|                                                                                           | 0/30 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "D:\Software\AI\Fooocus\Fooocus\modules\async_worker.py", line 645, in worker
    handler(task)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\async_worker.py", line 581, in handler
    imgs = pipeline.process_diffusion(
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\default_pipeline.py", line 351, in process_diffusion
    sampled_latent = core.ksampler(
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\core.py", line 263, in ksampler
    samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\sample.py", line 100, in sample
    samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\samplers.py", line 728, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler(), sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\sample_hijack.py", line 152, in sample_hacked
    samples = sampler.sample(model_wrap, sigmas, extra_args, callback_wrap, noise, latent_image, denoise_mask, disable_pbar)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\samplers.py", line 589, in sample
    samples = getattr(k_diffusion_sampling, "sample_{}".format(sampler_name))(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **extra_options)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\k_diffusion\sampling.py", line 701, in sample_dpmpp_2m_sde_gpu
    return sample_dpmpp_2m_sde(model, x, sigmas, extra_args=extra_args, callback=callback, disable=disable, eta=eta, s_noise=s_noise, noise_sampler=noise_sampler, solver_type=solver_type)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\k_diffusion\sampling.py", line 613, in sample_dpmpp_2m_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\patch.py", line 306, in patched_KSamplerX0Inpaint_forward
    out = self.inner_model(x, sigma,
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\patch.py", line 201, in patched_discrete_eps_ddpm_denoiser_forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\k_diffusion\external.py", line 155, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\samplers.py", line 275, in apply_model
    out = sampling_function(self.inner_model.apply_model, x, timestep, uncond, cond, cond_scale, model_options=model_options, seed=seed)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\samplers.py", line 253, in sampling_function
    cond, uncond = calc_cond_uncond_batch(model_function, cond, uncond, x, timestep, max_total_area, model_options)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\samplers.py", line 227, in calc_cond_uncond_batch
    output = model_options['model_function_wrapper'](model_function, {"input": input_x, "timestep": timestep_, "c": c, "cond_or_uncond": cond_or_uncond}).chunk(batch_chunks)
  File "D:\Software\AI\Fooocus\Fooocus\modules\patch.py", line 211, in patched_model_function_wrapper
    return func(x, t, **c)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\model_base.py", line 66, in apply_model
    return self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\modules\patch.py", line 407, in patched_unet_forward
    h = forward_timestep_embed(module, h, emb, context, transformer_options)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\openaimodel.py", line 56, in forward_timestep_embed
    x = layer(x, context, transformer_options)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\attention.py", line 560, in forward
    x = block(x, context=context[i], transformer_options=transformer_options)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\attention.py", line 390, in forward
    return checkpoint(self._forward, (x, context, transformer_options), self.parameters(), self.checkpoint)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\util.py", line 123, in checkpoint
    return func(*inputs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\attention.py", line 500, in _forward
    x = self.ff(self.norm3(x)) + x
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\attention.py", line 82, in forward
    return self.net(x)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\container.py", line 215, in forward
    input = module(input)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Software\AI\Fooocus\python_embeded\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "D:\Software\AI\Fooocus\Fooocus\backend\headless\fcbh\ldm\modules\attention.py", line 62, in forward
    return x * F.gelu(gate)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 1.57 GiB is free. Of the allocated memory 3.23 GiB is allocated by PyTorch, and 178.84 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 25.02 seconds
[Fooocus Model Management] Moving model(s) has taken 3.26 seconds

Side note: I'm using a Lenovo Legion 5 Pro laptop (16ACH16) with 16 GB of RAM and an RTX 3060, and I noticed that my memory usage goes crazy during the model-moving step ("Moving model(s)"): RAM usage hits nearly 100%.

I tried to search for how to change this max_split_size_mb but couldn't find anything useful.

Excuse my lack of knowledge; I barely know how to write HTML.
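For reference, max_split_size_mb is not a Fooocus setting but an option inside PyTorch's PYTORCH_CUDA_ALLOC_CONF environment variable, which must be set before CUDA is initialized. A minimal sketch (the 512 here is an illustrative value, not a tested recommendation for this GPU):

```python
import os

# Must be set before the first CUDA allocation, i.e. before importing torch.
# The variable takes comma-separated option:value pairs.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch  # imported only after the variable is set
```

With the embedded Python that Fooocus ships, the simpler route is to set the variable in the console before launching, e.g. `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512` followed by the usual `.\python_embeded\python.exe -s Fooocus\entry_with_update.py`.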

scsp85 commented 9 months ago

I wonder if it's related to the NVIDIA driver update that offloads VRAM to system RAM. You can now adjust a driver setting to turn that behavior off and keep everything in VRAM.

Direct Link: https://nvidia.custhelp.com/app/answers/detail/a_id/5490

https://www.reddit.com/r/StableDiffusion/comments/17km6v0/new_nvidia_driver_makes_offloading_to_ram_optional/

mashb1t commented 8 months ago

Please find the steps in the troubleshooting guide here: https://github.com/lllyasviel/Fooocus/blob/8e62a72a63b30a3067d1a1bc3f8d226824bd9283/troubleshoot.md#i-am-using-nvidia-with-6gb-vram-i-get-cuda-out-of-memory