lllyasviel / Fooocus

Focus on prompting and generating

CUDA out of memory #1114

Closed Darkweasam closed 6 months ago

Darkweasam commented 9 months ago

Describe the problem
I get a crash whenever I try to generate anything on a 6GB GPU with 16GB RAM. Why is that? NVIDIA GeForce GTX 1060 6GB.

Full Console Log

```
X:\AI\Fooocus_win64_2-1-791>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --preset realistic
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\entry_with_update.py', '--preset', 'realistic']
Loaded preset: X:\AI\Fooocus_win64_2-1-791\Fooocus\presets\realistic.json
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865

To create a public link, set share=True in launch().
Total VRAM 6144 MB, total RAM 16326 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 NVIDIA GeForce GTX 1060 6GB : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
model_type EPS
adm 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra keys {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors].
Loaded LoRA [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [X:\AI\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\realisticStockPhoto_v10.safetensors] with 264 keys at weight 0.25.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.40 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 3598152127187655813
[Fooocus] Downloading control models ...
[Fooocus] Loading control models ...
extra keys clip vision: ['vision_model.embeddings.position_ids']
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] dog, intricate, elegant, highly detailed, extremely beautiful,, symmetry, sharp focus, inspired, charismatic, very coherent, cute, innocent, fine detail, full color, cinematic, winning, artistic, smart, joyful, attractive, pretty, illuminated, colorful, light, cozy, novel, epic, dramatic ambient background, determined, focused, quality, atmosphere
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] dog, intricate, elegant, highly detailed, sharp focus, candid, sublime, dramatic, thought, cinematic, new classic, best, attractive, unique, beautiful, creative, positive, cute, smart, agile, passionate, cheerful, pretty, inspired, color, spread light, magic, cool, friendly, extremely detail, lovely, amazing, flowing, complex
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.11 seconds
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
Detected 1 faces
Requested to load CLIPVisionModelWithProjection
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.50 seconds
Requested to load Resampler
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.46 seconds
Requested to load To_KV
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.32 seconds
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (1152, 896)
Preparation time: 14.29 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3194.4163160324097
Traceback (most recent call last):
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
    handler(task)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
    imgs = pipeline.process_diffusion(
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
    sampled_latent = core.ksampler(
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 315, in ksampler
    samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 93, in sample
    real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 86, in prepare_sampling
    fcbh.model_management.load_models_gpu([model] + models, model.memory_required(noise_shape) + inference_memory)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
    y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 298, in model_load
    accelerate.dispatch_model(self.real_model, device_map=device_map, main_device=self.device)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\big_modeling.py", line 371, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 506, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 253, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\utils\modeling.py", line 292, in set_module_tensor_to_device
    new_value = old_value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 4.15 GiB is free. Of the allocated memory 881.50 MiB is allocated by PyTorch, and 54.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 55.23 seconds
ERROR clip_g.transformer.text_model.encoder.layers.0.mlp.fc1.weight CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 4.10 GiB is free. Of the allocated memory 938.78 MiB is allocated by PyTorch, and 49.22 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Exception in thread Thread-2 (worker):
Traceback (most recent call last):
  File "threading.py", line 1016, in _bootstrap_inner
  File "threading.py", line 953, in run
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 809, in worker
    pipeline.prepare_text_encoder(async_call=True)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 211, in prepare_text_encoder
    fcbh.model_management.load_models_gpu([final_clip.patcher, final_expansion.patcher])
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
    y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 293, in model_load
    raise e
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 289, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_patcher.py", line 191, in patch_model
    temp_weight = fcbh.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
  File "X:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 532, in cast_to_device
    return tensor.to(device, copy=copy).to(dtype)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 6.00 GiB of which 4.10 GiB is free. Of the allocated memory 915.50 MiB is allocated by PyTorch, and 72.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
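The OOM message itself points at one mitigation: setting `max_split_size_mb` through the `PYTORCH_CUDA_ALLOC_CONF` environment variable to reduce allocator fragmentation. A minimal sketch of how that would be wired up (the value 512 is an illustrative guess, not a tested recommendation, and the variable must be set before PyTorch makes its first CUDA allocation):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is read when the CUDA caching allocator initializes,
# so it must be set before launching Fooocus (or at the very top of the entry
# script, before torch is imported).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"  # 512 is an illustrative value

# On Windows, the equivalent in the console before the launch command would be:
#   set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
#   .\python_embeded\python.exe -s Fooocus\entry_with_update.py --preset realistic
```

Whether this actually avoids the crash on a 6GB card is untested here; it only addresses the fragmentation case the error message describes, not a genuine capacity shortfall.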

acvcleitao commented 9 months ago

+1

mashb1t commented 8 months ago

@lllyasviel could you find out whether an NVIDIA GeForce GTX 1060 (or, generally speaking, any 10XX card with 6GB VRAM) can be used at all, or whether this is a bug? (https://github.com/lllyasviel/Fooocus/blob/main/troubleshoot.md#i-am-using-nvidia-with-6gb-vram-i-get-cuda-out-of-memory)
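One data point from the log above: it prints `loading in lowvram mode 3194.4163160324097` (roughly the MiB the loader wants to keep on the GPU), while the OOM message reports 4.15 GiB free and a failed allocation of only 30 MiB. A quick sanity check of that arithmetic (numbers copied from the log; the variable names are mine, not Fooocus identifiers):

```python
# Numbers copied from the console log in this issue; names are illustrative only.
lowvram_model_mib = 3194.4163160324097  # "loading in lowvram mode 3194.416..."
free_gib = 4.15                         # "4.15 GiB is free" in the OOM message
tried_mib = 30.0                        # "Tried to allocate 30.00 MiB"

free_mib = free_gib * 1024  # 1 GiB = 1024 MiB
headroom_mib = free_mib - tried_mib

# ~4250 MiB is reported free, which exceeds both the 30 MiB request and the
# ~3194 MiB lowvram budget, so the raw capacity looks sufficient on paper.
print(f"free: {free_mib:.0f} MiB, requested: {tried_mib:.0f} MiB, headroom: {headroom_mib:.0f} MiB")
```

If that reading is right, the card is not strictly too small for the lowvram path; the 30 MiB allocation fails despite gigabytes reported free, which is more consistent with fragmentation or allocator behavior (the `max_split_size_mb` hint in the message) than with a plain capacity shortfall.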

mashb1t commented 6 months ago

Closing as stale, feel free to provide more information to reopen.

Darkweasam commented 6 months ago

> Closing as stale, feel free to provide more information to reopen.

If you are asking me, I am unsure what other information I could provide to help fix this...