carnager closed this issue 8 months ago
Can you please check if it works without setting --attention-split (not setting any arguments)? Thanks!
Yeah, tried that already; same behavior without any arguments.
I had the same issue before and after creating the 40 GB swap partition. It doesn't seem to be RAM/memory related.
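(For reference, a minimal sketch of how that swap space could be set up as a swap file rather than a partition; the 40 GB size and the /swapfile path are illustrative assumptions, not from this thread:)
$ sudo fallocate -l 40G /swapfile   # reserve 40 GB of disk space
$ sudo chmod 600 /swapfile          # swapon refuses world-readable swap files
$ sudo mkswap /swapfile             # write the swap signature
$ sudo swapon /swapfile             # activate it
$ swapon --show                     # verify the kernel is actually using it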
Full logs:
$ python launch.py
[System ARGV] ['launch.py']
Python 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]
Fooocus version: 2.1.860
Running on local URL: http://127.0.0.1:7865
To create a public link, set `share=True` in `launch()`.
amdgpu.ids: No such file or directory
amdgpu.ids: No such file or directory
Total VRAM 8176 MB, total RAM 32018 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 AMD Radeon Graphics : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: /home/codeliger/dl/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/codeliger/dl/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/codeliger/dl/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/codeliger/dl/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.52 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 5323403996105043708
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
Segmentation fault (core dumped)
I have a similar problem; it seems like the swap is not being used or found. I am using an NVIDIA 3090, but when forcing CPU-only, an error pops up about not finding virtual memory. I think this problem is related.
Running normally:
[System ARGV] ['launch.py']
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Fooocus version: 2.1.860
Running on local URL: http://127.0.0.1:7865
To create a public link, set `share=True` in `launch()`.
Total VRAM 24257 MB, total RAM 15912 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 3090 : native
VAE dtype: torch.bfloat16
Using pytorch cross attention
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: /home/myName/Documents/img-gen/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/myName/Documents/img-gen/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/myName/Documents/img-gen/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/myName/Documents/img-gen/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.34 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 5428285024980375409
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] brown horse on the beach, intricate, elegant, highly detailed, wonderful colors, sweet, extremely delicate, majestic, holy, dramatic, sharp focus, professional composition, fantastic, iconic, fine light, excellent, very inspirational, ambient, artistic, vibrant, imposing, epic, thought, magnificent, stunning, awesome, cinematic, dynamic, complex, amazing, creative, brilliant
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] brown horse on the beach, intricate, elegant, highly detailed, extremely shiny, wonderful colors, ambient light, dynamic background, sharp focus, professional fine detail, best animated, cinematic, singular, rich, vivid, beautiful, unique, cute, attractive, epic, gorgeous, stunning, great, awesome, amazing, breathtaking, dramatic, illuminated, outstanding, very coherent, perfect
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 2.52 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.93 seconds
0%| | 0/30 [00:00<?, ?it/s]
Segmentation fault (core dumped)
Running CPU only:
(fooocus) myName@pop-os:~/Documents/img-gen/Fooocus$ python entry_with_update.py --preview-option fast --always-cpu
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py', '--preview-option', 'fast', '--always-cpu']
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
Fooocus version: 2.1.860
Running on local URL: http://127.0.0.1:7865
To create a public link, set `share=True` in `launch()`.
Total VRAM 15912 MB, total RAM 15912 MB
Set vram state to: DISABLED
Always offload VRAM
Device: cpu
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: /home/myName/Documents/img-gen/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/myName/Documents/img-gen/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/myName/Documents/img-gen/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/myName/Documents/img-gen/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8454048247502736915
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] brown horse on the beach, cinematic, epic, dramatic ambient, professional, highly detailed, extremely beautiful, emotional, cute, symmetry, intricate, light, surreal, pretty, inspiring, elegant, crisp sharp focus, artistic, very inspirational,, novel, romantic, new, cheerful, inspired, generous, color, cool, passionate, vibrant, background, colorful, shiny
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] brown horse on the beach, intricate, elegant, highly detailed, extremely beautiful, glowing, sharp focus, refined, complex, colors, cinematic, surreal, artistic, scenic, attractive, thought, singular, iconic, fine detail, clear, ambient light, full color, perfect composition, symmetry, aesthetic, great, pure, pristine, very inspirational, professional, winning, best
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 12.63 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 119.24 seconds
0%| | 0/30 [00:00<?, ?it/s]
/home/myName/anaconda3/envs/fooocus/lib/python3.10/site-packages/psutil/__init__.py:1973: RuntimeWarning: available memory stats couldn't be determined and was set to 0
ret = _psplatform.virtual_memory()
7%|██████████ | 2/30 [07:04<1:37:07, 208.12s/it]^CKeyboard interruption in main thread... closing server.
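(Side note: the psutil warning above can be reproduced outside Fooocus; a quick diagnostic one-liner, assuming the same Python environment, is:)
$ python -c "import psutil; print(psutil.virtual_memory()); print(psutil.swap_memory())"   # shows what psutil thinks is available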
nvidia-smi, driver & CUDA versions (which should be compatible with the current torch version):
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06 Driver Version: 545.29.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3090 Off | 00000000:08:00.0 On | N/A |
| 0% 18C P8 18W / 350W | 752MiB / 24576MiB | 10% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 2320 G /usr/lib/xorg/Xorg 245MiB |
| 0 N/A N/A 2430 G /usr/bin/gnome-shell 110MiB |
| 0 N/A N/A 3106 G ...sion,SpareRendererForSitePerProcess 52MiB |
| 0 N/A N/A 3334 G firefox 325MiB |
+---------------------------------------------------------------------------------------+
EDIT: After downgrading the drivers to 535.129.03 just to be sure, the results remain the same.
I looked in the docs and in other issues for how to go about debugging this, but it is not clear to me. I'd love to help contribute if there are some resources I can start with.
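(One generic way to get more than a bare "Segmentation fault" is standard Python/native debugging, not a documented Fooocus workflow:)
$ python -X faulthandler launch.py   # prints the Python traceback when a fatal signal arrives
$ gdb --args python launch.py        # then type "run"; after the crash, "bt" shows the native backtrace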
I have the same issue (Ryzen R9 7900X, 64 GB RAM, of which 16 GB is VRAM for the integrated GPU). When I run on Linux I have exactly the same issue. On Windows (I dual boot) it runs more or less OK (I sometimes get a crash because of the memory-leak issue, but it works). Could it be linked to the ROCm version, which differs in the Linux/AMD instructions (5.6 instead of 5.7)? I have read that it does not go well with the VAE version (?). I tried a manual upgrade of ROCm, but it caused other problems. I found this interesting on the same subject: https://www.reddit.com/r/comfyui/comments/15b8lxd/comfyui_is_not_detecting_my_gpus_vram/
Hello, same issue here. It works with --always-cpu.
Feel free to tell me if I can run some tests or provide more information.
Hello, I had the same issue with the segfault on the following hardware:
I managed to make it run by doing the following:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
HSA_OVERRIDE_GFX_VERSION=11.0.0 python entry_with_update.py
(sourced here: https://github.com/lllyasviel/Fooocus/issues/1019)
I can't test if it works on the RX 6000 series, as my previous card is fried.
Hope this helps :smile:
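(If you are unsure which override value matches your card, ROCm reports the gfx target it detected; gfx1100 corresponds to the 11.0.0 override above, gfx1030 to 10.3.0:)
$ rocminfo | grep -i gfx   # e.g. "Name: gfx1100" -> HSA_OVERRIDE_GFX_VERSION=11.0.0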
Sadly this does not work for me...
Memory access fault by GPU node-1 (Agent handle: 0x7f6c79b37c80) on address 0x7f6d60e85000. Reason: Page not present or supervisor privilege.
OK, I could make it run with the low-VRAM option, but it never finishes generating any images.
Thank you @Athoir, but exactly the same as @carnager: it runs with the --attention-split and --always-low-vram options, but fails shortly after the beginning of the image generation:
Memory access fault by GPU node-1 (Agent handle: 0x7facdd668c60) on address 0x7fae2da8b000. Reason: Page not present or supervisor privilege. Abandon (core dumped)
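(A generic next step for a GPU memory access fault like this is to check the kernel log around the crash time; this is standard diagnostics, not something confirmed in this thread:)
$ sudo dmesg | grep -i amdgpu   # look for page-fault messages from the amdgpu driver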
Complete steps to reproduce on Fedora / Nobara:
$ sudo dnf install python3.10 rocm-opencl rocm-hip-runtime
$ python3.10 -m venv fooocus_env
$ source fooocus_env/bin/activate
$ pip install -r requirements_versions.txt
$ pip uninstall torch torchvision torchaudio torchtext functorch xformers
$ pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
$ HSA_OVERRIDE_GFX_VERSION=11.0.0 python entry_with_update.py --attention-split --always-low-vram
Perhaps related to the torch version? (see https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8139#issuecomment-1545521725)
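(A quick way to confirm which torch build is actually loaded; the ROCm wheels report a HIP version here, while the CUDA wheels report None:)
$ python -c "import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())"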
For RX 6700 XT, setting HSA_OVERRIDE_GFX_VERSION=10.3.0 helped, as mentioned here.
Not for me...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] cosy bear reading a book, warm colors, cinematic, highly detailed, incredible quality, very inspirational, thought, fancy, epic, singular background, elegant, intricate, dynamic light, beautiful, enhanced, bright, colorful, color, illuminated, inspired, deep rich vivid, coherent, glowing, complex, amazing, symmetry, full composed, brilliant, perfect composition, pure
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] cosy bear reading a book, light flowing magic, cool colors, glowing, amazing, highly detailed, intricate, sharp focus, professional animated, vivid, best, contemporary, modern, romantic, inspired, new, creative, beautiful, attractive, advanced, cinematic, artistic color, surreal, emotional, cute, adorable, perfect, focused, positive, exciting, lucid, joyful
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 5.71 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
And then nothing happens.
Working FAST with HSA_OVERRIDE_GFX_VERSION=10.3.0 (with and without --attention-split)!
Many thanks @OronDF343
Also an RX 6700 XT user here; using HSA_OVERRIDE_GFX_VERSION=10.3.0 helped.
I confirm, on a 6600 XT this solves the problem.
It's running here with no problems using this gist; you need to set the HSA_OVERRIDE_GFX_VERSION=10.3.0 flag for Radeon 6000 cards: https://gist.github.com/hqnicolas/5fbb9c37dcfc29c9a0ffe50fbcb35bdd
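(If the override fixes it, it doesn't have to be typed on every launch; one common way to persist it, with paths given only as an example:)
$ echo 'export HSA_OVERRIDE_GFX_VERSION=10.3.0' >> ~/.bashrc   # or put it in a small launch wrapper script
$ source ~/.bashrc
$ python entry_with_update.py   # the override is now picked up from the environment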
Read Troubleshoot
[x] I admit that I have read the Troubleshoot before making this issue.
Describe the problem
I installed Fooocus on Linux using the instructions on the main page. I uninstalled regular torch and installed the AMD version as mentioned on the front page. I created a 40 GB swap space and then ran the app with
python launch.py --attention-split
When I try to run an image generation, it seems to do something but then segfaults. Some info about my setup:
Full Console Log