Closed yorai1212 closed 1 year ago
I second this, but I am jealous that you have 25x faster Total time... cmd log.txt
Haha, let's first all get an image generated; who knows what's happening here. Not sure those 10.15 seconds actually generated anything.
AMD 8 GB cannot be used
C:\Users\lz\Downloads\Fooocus_win64_2-1-25>.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
Found existing installation: torch 2.0.0
Uninstalling torch-2.0.0:
  Successfully uninstalled torch-2.0.0
Found existing installation: torchvision 0.15.1
Uninstalling torchvision-0.15.1:
  Successfully uninstalled torchvision-0.15.1
WARNING: Skipping torchaudio as it is not installed.
WARNING: Skipping torchtext as it is not installed.
WARNING: Skipping functorch as it is not installed.
WARNING: Skipping xformers as it is not installed.

C:\Users\lz\Downloads\Fooocus_win64_2-1-25>.\python_embeded\python.exe -m pip install torch-directml
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Requirement already satisfied: torch-directml in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (0.2.0.dev230426)
Collecting torch==2.0.0 (from torch-directml)
  Using cached https://mirrors.aliyun.com/pypi/packages/87/e2/62dbdfc85d3b8f771bc4b1a979ee6a157dbaa8928981dabbf45afc6d13dc/torch-2.0.0-cp310-cp310-win_amd64.whl (172.3 MB)
Collecting torchvision==0.15.1 (from torch-directml)
  Using cached https://mirrors.aliyun.com/pypi/packages/03/06/6ba7532c66397defffb79f64cac46f812a29b2f87a4ad65a3e95bc164d62/torchvision-0.15.1-cp310-cp310-win_amd64.whl (1.2 MB)
Requirement already satisfied: filelock in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.12.2)
Requirement already satisfied: typing-extensions in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (4.7.1)
Requirement already satisfied: sympy in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (1.12)
Requirement already satisfied: networkx in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1)
Requirement already satisfied: jinja2 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torch==2.0.0->torch-directml) (3.1.2)
Requirement already satisfied: numpy in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (1.23.5)
Requirement already satisfied: requests in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from torchvision==0.15.1->torch-directml) (9.2.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from jinja2->torch==2.0.0->torch-directml) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2.0.3)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from requests->torchvision==0.15.1->torch-directml) (2023.5.7)
Requirement already satisfied: mpmath>=0.19 in c:\users\lz\downloads\fooocus_win64_2-1-25\python_embeded\lib\site-packages (from sympy->torch==2.0.0->torch-directml) (1.3.0)
DEPRECATION: torchsde 0.2.5 has a non-standard dependency specifier numpy>=1.19.*; python_version >= "3.7". pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of torchsde or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Installing collected packages: torch, torchvision
WARNING: The scripts convert-caffe2-to-onnx.exe, convert-onnx-to-caffe2.exe and torchrun.exe are installed in 'C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed torch-2.0.0 torchvision-0.15.1

C:\Users\lz\Downloads\Fooocus_win64_2-1-25>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.37
Inference Engine exists and URL is correct.
Inference Engine checkout finished for d1a0abd40b86f3f079b0cc71e49f9f4604831457.
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 32688 MB
Set vram state to: NORMAL_VRAM
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
model_type EPS
adm 2560
Refiner model loaded: C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\models\checkpoints\sd_xl_refiner_1.0_0.9vae.safetensors
model_type EPS
adm 2816
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\models\checkpoints\sd_xl_base_1.0_0.9vae.safetensors
LoRAs loaded: [('sd_xl_offset_example-lora_1.0.safetensors', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5), ('None', 0.5)]
Fooocus Expansion engine loaded for privateuseone:0, use_fp16 = False.
loading new
App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 0.812
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.06
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\transformers\generation\utils.py:723: UserWarning: The operator 'aten::repeat_interleave.Tensor' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at D:\a\_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
input_ids = input_ids.repeat_interleave(expand_size, dim=0)
[Prompt Expansion] New suffix: intricate, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, Unreal Engine 5, 8K, art by artgerm and greg rutkowski and alphonse mucha
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed eyes. By Makoto Shinkai, Stanley Artgerm Lau, WLOP, Rossdraws, James Jean, Andrei Riabovitchev, Marc Simonetti, krenz cushart, Sakimichan, D&D trending on ArtStation, digital art
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 8.80 seconds
loading new
ERROR diffusion_model.output_blocks.1.1.transformer_blocks.2.ff.net.0.proj.weight Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!
Traceback (most recent call last):
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\modules\async_worker.py", line 565, in worker
handler(task)
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, *kwargs)
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(args, **kwargs)
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\modules\async_worker.py", line 470, in handler
comfy.model_management.load_models_gpu([pipeline.final_unet])
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 397, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 286, in model_load
raise e
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 282, in model_load
self.real_model = self.model.patch_model(device_to=patch_model_to) #TODO: do something with loras and offloading to CPU
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_patcher.py", line 161, in patch_model
temp_weight = comfy.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
File "C:\Users\lz\Downloads\Fooocus_win64_2-1-25\Fooocus\repositories\ComfyUI-from-StabilityAI-Official\comfy\model_management.py", line 498, in cast_to_device
return tensor.to(device, copy=copy).to(dtype)
RuntimeError: Could not allocate tensor with 26214400 bytes. There is not enough GPU video memory available!
Total time: 37.06 seconds
.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py
Line 32
generator = torch.Generator(device).manual_seed(int(seed))
to
generator = torch.Generator().manual_seed(int(seed))
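For anyone wondering why this one-line change helps: torch.Generator cannot be constructed on DirectML's privateuseone device, so the generator has to live on the CPU, and sampled tensors get moved to the device afterwards. A minimal sketch of the idea, assuming the torch-directml package is installed (the randn call is just an illustration, not torchsde's actual code):

```python
import torch
import torch_directml  # assumption: torch-directml is installed

device = torch_directml.device()  # reports itself as privateuseone:0

# torch.Generator(device) raises:
#   "Device type privateuseone is not supported for torch.Generator() api"
# so the workaround seeds a CPU generator instead:
generator = torch.Generator().manual_seed(1234)

# sample on the CPU with the seeded generator, then move to the DML device
noise = torch.randn(4, 4, generator=generator).to(device)
print(noise.device)  # privateuseone:0
```

The trade-off is that the random stream now comes from the CPU RNG rather than the device, which should be harmless here.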
torchsde\_brownian
Don't have that folder..
Also, is it safe to edit the files without lllyasviel confirming we should do that?
Edit: found the file in C:\Fooocus_win64_2-1-25\python_embeded\Lib\site-packages\torchsde\_brownian. Works fine now! Thank you!
I solved the issue by editing this file:
.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py
Line 32
generator = torch.Generator(device).manual_seed(int(seed))
to
generator = torch.Generator().manual_seed(int(seed))
this worked for me too, thank you so much!
There is no torchsde_brownian. What is happening here?
It's torchsde\_brownian: put a "\" between "torchsde" and "_brownian".
Okay no need to spam LMAO
me too!
same here
same.
A part I found interesting as well was the "Total VRAM 1024 MB" line; it's almost like it's grabbing the board's local memory or something. 🤔
Also, what seems like a memory leak: just having the Gradio live window open, it starts ramping and doesn't stop till I close it. 32 GB of RAM, too 🤔
RuntimeError: Could not allocate tensor with 165150720 bytes. There is not enough GPU video memory available!
AMD Radeon RX 6700 XT
After applying the fix above, now getting the allocation error:
RuntimeError: Could not allocate tensor with 6553600 bytes. There is not enough GPU video memory available!
RX 5700XT, Windows 10
For the memory allocation problem I found a quick fix, but use it with caution.
Go to \Fooocus\backend\headless\fcbh\model_management.py
In line 95 change mem_total = 1024 * 1024 * 1024
to mem_total = 8192 * 1024 * 1024.
I have a 16 GB VRAM GPU, so I assigned 8 GB; you can try to assign more.
Also, if you have like 4 GB VRAM, then I would suggest changing the line to mem_total = 2048 * 1024 * 1024,
which will allocate 2 GB VRAM. Of course you can try to use all your VRAM :)
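For anyone unsure about these numbers: mem_total is a plain byte count (the DirectML backend apparently cannot report real VRAM, hence the hard-coded fallback). A quick sketch of the arithmetic only, not the actual fcbh code:

```python
# mem_total is bytes; the "Total VRAM" log line is mem_total / (1024 * 1024)
GIB = 1024 * 1024 * 1024  # 1 GiB = 1073741824 bytes

print(1 * GIB)  # 1073741824 -> the shipped default, logged as "Total VRAM 1024 MB"
print(8 * GIB)  # 8589934592 -> the edit above, logged as "Total VRAM 8192 MB"
print(2 * GIB)  # 2147483648 -> the 4 GB-card suggestion, "Total VRAM 2048 MB"
```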
I have done this, first with mem_total = 4096 * 1024 * 1024 and then with mem_total = 8192 * 1024 * 1024, but I still get the memory error with a Vega 64 8 GB.
Restart the app and check TOTAL VRAM printed out on start. In my case when I set it to 12GB it looks like this:
I'm at 4096 on VRAM
Looks ok, I would try generating with different settings. Lower image resolution maybe. Sorry I can't help you more.
After making both fixes in this thread, I'm still getting the allocation error.
To create a public link, set share=True in launch(). Using directml with device: Total VRAM 8192 MB, total RAM 16318 MB Set vram state to: NORMAL_VRAM ... snip ... RuntimeError: Could not allocate tensor with 10485760 bytes. There is not enough GPU video memory available!
RX 5700XT, Win 10
One image with all default settings took more than 4 minutes. Is this normal?
PC specs:
Am I doing something wrong or do I have a bottleneck?
Also the fix worked:
Edit file brownian_interval.py
in \python_embeded\Lib\site-packages\torchsde\_brownian
Line 32
Change:
generator = torch.Generator(device).manual_seed(int(seed))
to:
generator = torch.Generator().manual_seed(int(seed))
@MatSkrzat Appreciate the suggestion. I tried bumping the memory:
Total VRAM 8192 MB, total RAM 32700 MB
mem_total = 1024 * 12 * 1024 * 1024
Total VRAM 12288 MB, total RAM 32700 MB
Both yielded the same out-of-memory error:
RuntimeError: Could not allocate tensor with 165150720 bytes. There is not enough GPU video memory available!
RuntimeError: Could not allocate tensor with 6553600 bytes. There is not enough GPU video memory available!
AMD Radeon RX 6600, Windows 10
Same "out of memory" issue here after using both fixes GPU rx7900xt 20GB
I did the torchsde fix but I'm still getting this error:
E:\AI\Fooocus_win64_2-1-791>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865
To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 1024 MB, total RAM 16310 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
Exception in thread Thread-2 (worker):
Traceback (most recent call last):
File "threading.py", line 1016, in _bootstrap_inner
File "threading.py", line 953, in run
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 25, in worker
import modules.default_pipeline as pipeline
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 253, in <module>
refresh_everything(
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 233, in refresh_everything
refresh_base_model(base_model_name)
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 69, in refresh_base_model
model_base = core.load_model(filename)
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 152, in load_model
unet, clip, vae, clip_vision = load_checkpoint_guess_config(ckpt_filename, embedding_directory=path_embeddings)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sd.py", line 446, in load_checkpoint_guess_config
model = model_config.get_model(sd, "model.diffusion_model.", device=inital_load_device)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\supported_models.py", line 163, in get_model
out = model_base.SDXL(self, model_type=self.model_type(state_dict, prefix), device=device)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_base.py", line 243, in __init__
super().__init__(model_config, model_type, device=device)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_base.py", line 40, in __init__
self.diffusion_model = UNetModel(**unet_config, device=device)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\openaimodel.py", line 520, in __init__
ResBlock(
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\ldm\modules\diffusionmodules\openaimodel.py", line 190, in __init__
operations.conv_nd(dims, self.out_channels, self.out_channels, 3, padding=1, dtype=dtype, device=device)
File "E:\AI\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\ops.py", line 18, in conv_nd
return Conv2d(*args, **kwargs)
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 450, in __init__
super().__init__(
File "E:\AI\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\nn\modules\conv.py", line 137, in __init__
self.weight = Parameter(torch.empty(
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 58982400 bytes.
Same here: RX 580 8 GB, Ryzen 5600, 32 GB RAM.
Same error after doing both steps
.\python_embeded\Lib\site-packages\torchsde\_brownian\brownian_interval.py
Line 32
generator = torch.Generator(device).manual_seed(int(seed))
to
generator = torch.Generator().manual_seed(int(seed))
then tried to allocate more VRAM:
Go to \Fooocus\backend\headless\fcbh\model_management.py
In line 95 change mem_total = 1024 * 1024 * 1024
to mem_total = 8192 * 1024 * 1024.
Still getting the same error: "Could not allocate tensor with 165150720 bytes. There is not enough GPU video memory available!"
Note: I remember messing with the file "cli_args.py" and I got it working, then stopped; wanted to mention it in case someone can make use of the info.
Change from
vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_true", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_true", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_true", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")
To
vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_false", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_false", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_false", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")
PC specs: 12 GB GPU & 32 GB RAM, Windows 11
I am trying to run this in my environment with an RX 5500M 4 GB + Ryzen 5 5600H + 24 GB RAM.
Initially I encountered both the "Device type privateuseone is not supported for torch.Generator() api" error and the "DefaultCPUAllocator: not enough memory: you tried.." error, but both of these are fixed by the above solutions.
Now I am facing "AssertionError: Torch not compiled with CUDA enabled"; can anyone please assist?
W:\Fooocus_win64_2-1-791>python_embeded\python.exe -s Fooocus\entry_with_update.py --directml --lowvram
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--directml', '--lowvram']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865
To create a public link, set `share=True` in `launch()`.
Using directml with device:
Total VRAM 4096 MB, total RAM 23906 MB
Set vram state to: LOW_VRAM
Disabling smart memory management
Device: privateuseone
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: W:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [W:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [W:\Fooocus_win64_2-1-791\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [W:\Fooocus_win64_2-1-791\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8429543823708178985
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] boat in the sea, cinematic, dramatic ambient light, detailed, dynamic, full intricate, elegant, highly elaborate, colorful, vivid, breathtaking, sharp focus, fine detail, symmetry, clear, artistic, color, altered, epic, romantic, scenic, background, professional, enhanced, calm, joyful, unique, awesome, creative, positive, lucid, loving, beautiful
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] boat in the sea, extremely detailed, magic, perfect, vibrant colors, dramatic, cinematic, artistic, complex, highly color balanced, enigmatic, sharp focus, open atmosphere, warm light, amazing composition, inspired, beautiful surreal, creative, positive, unique, joyful, very inspirational, inspiring, pure, thought, pristine, epic, hopeful, shiny, coherent, cute
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (896, 1152)
Preparation time: 19.68 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 256.0
Traceback (most recent call last):
File "W:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 803, in worker
handler(task)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\Fooocus\modules\async_worker.py", line 735, in handler
imgs = pipeline.process_diffusion(
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\Fooocus\modules\default_pipeline.py", line 361, in process_diffusion
sampled_latent = core.ksampler(
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\Fooocus\modules\core.py", line 315, in ksampler
samples = fcbh.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 93, in sample
real_model, positive_copy, negative_copy, noise_mask, models = prepare_sampling(model, noise.shape, positive, negative, noise_mask)
File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\sample.py", line 86, in prepare_sampling
fcbh.model_management.load_models_gpu([model] + models, model.memory_required(noise_shape) + inference_memory)
File "W:\Fooocus_win64_2-1-791\Fooocus\modules\patch.py", line 494, in patched_load_models_gpu
y = fcbh.model_management.load_models_gpu_origin(*args, **kwargs)
File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 410, in load_models_gpu
cur_loaded_model = loaded_model.model_load(lowvram_model_memory)
File "W:\Fooocus_win64_2-1-791\Fooocus\backend\headless\fcbh\model_management.py", line 298, in model_load
accelerate.dispatch_model(self.real_model, device_map=device_map, main_device=self.device)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\big_modeling.py", line 371, in dispatch_model
attach_align_device_hook_on_blocks(
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
attach_align_device_hook_on_blocks(
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 536, in attach_align_device_hook_on_blocks
attach_align_device_hook_on_blocks(
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 506, in attach_align_device_hook_on_blocks
add_hook_to_module(module, hook)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 155, in add_hook_to_module
module = hook.init_hook(module)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\hooks.py", line 253, in init_hook
set_module_tensor_to_device(module, name, self.execution_device)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\accelerate\utils\modeling.py", line 292, in set_module_tensor_to_device
new_value = old_value.to(device)
File "W:\Fooocus_win64_2-1-791\python_embeded\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
Total time: 131.78 seconds
@ljnath, try running run.bat instead.
@wcalvert, I tried the same and I face the same error message.
I'm also having issues with CUDA on AMD hardware.
Hardware:
Windows 10 Pro 22H2, 32 GB RAM, Ryzen 5800X, Radeon 6700 XT (12 GB VRAM)
Have done the following fixes:
Changed brownian_interval.py - generator = torch.Generator().manual_seed(int(seed)) - Line 32
Changed model_management.py - mem_total = 8192 * 1024 * 1024 - line 95
Changed cli_args.py (see below)
vram_group = parser.add_mutually_exclusive_group()
vram_group.add_argument("--gpu-only", action="store_true", help="Store and run everything (text encoders/CLIP models, etc... on the GPU).")
vram_group.add_argument("--highvram", action="store_false", help="By default models will be unloaded to CPU memory after being used. This option keeps them in GPU memory.")
vram_group.add_argument("--normalvram", action="store_true", help="Used to force normal vram use if lowvram gets automatically enabled.")
vram_group.add_argument("--lowvram", action="store_false", help="Split the unet in parts to use less vram.")
vram_group.add_argument("--novram", action="store_false", help="When lowvram isn't enough.")
vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")
Still getting the CUDA error when running run.bat, and I'm unsure why it's making a CUDA dependency call at all: CUDA is only supported on Nvidia, AMD doesn't use it; we use ROCm...
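The traceback shows where the CUDA assumption sneaks in: accelerate's low-VRAM dispatch resolves the execution device through torch.cuda, which a DirectML-only install cannot initialise. A quick way to confirm your build simply has no CUDA support (a diagnostic sketch, not a fix):

```python
import torch

print(torch.cuda.is_available())  # False on a DirectML-only install
print(torch.version.cuda)         # None: this wheel was built without CUDA

# Any code path that reaches torch.cuda._lazy_init() on such a build,
# like accelerate's set_module_tensor_to_device in the traceback above,
# raises: AssertionError: Torch not compiled with CUDA enabled
```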
If I apply this fix I get a warning that it is CPU rendering.
:\scratch\Fooocus_win64_2-1-791\Fooocus\modules\anisotropic.py:132: UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at D:\a\_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
Anybody having similar issues?
Edit: I am running an RX 6900 XT and Win 11.
I have also applied the fixes:
Changed brownian_interval.py - generator = torch.Generator().manual_seed(int(seed)) - Line 32
Changed model_management.py - mem_total = 8192 * 1024 * 1024 - line 95
I'm still getting RuntimeError: Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available! Total time: 106.90 seconds.
I have an RX 580 and Win 10. Does anyone know if this means it's trying to allocate too much, or if my hardware is not good enough? I don't have enough experience to know the difference. Thanks
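One way to sanity-check these numbers: every failed allocation reported in this thread is tiny compared to the cards involved, which suggests the allocator or the hard-coded VRAM accounting is the problem rather than the hardware. For example:

```python
# sizes of the failed allocations reported in this thread, converted to MB
for n in (52428800, 26214400, 165150720, 6553600, 10485760):
    print(f"{n} bytes = {n / (1024 * 1024):.2f} MB")
# 50.00, 25.00, 157.50, 6.25, 10.00 MB -- far below even a 4 GB card
```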
I found a comment in another issue speculating that the problem is due to a deeper code issue, memory allocation inside a loop, which is causing the out-of-memory problem:
https://github.com/lllyasviel/Fooocus/issues/1078#issuecomment-1835745769
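If that speculation is right, the failure mode would look roughly like the contrived sketch below: small tensors allocated on the DirectML device inside a loop, with references kept alive, eventually exhaust VRAM even though each individual allocation fits easily. This is only an illustration of the hypothesis, not Fooocus's actual code:

```python
import torch
import torch_directml  # assumption: torch-directml is installed

dml = torch_directml.device()
held = []
for step in range(100_000):
    # ~4 MB per float32 1024x1024 tensor; keeping every reference alive
    # mimics an allocate-inside-a-loop leak
    held.append(torch.empty(1024, 1024, device=dml))
# eventually: RuntimeError: Could not allocate tensor with N bytes.
# There is not enough GPU video memory available!
```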
I tried everything here too on my 64 GB of RAM and 6700 XT. It always says it is out of memory before it even gets started. Hoping this gets fixed, because I was able to run it on my Mac and the results are lovely. But it's an hour per image.
I even get a bluescreen of death named something like "error with GPU memory management".
I have exactly the same issue. I have a PC with 64 GB of RAM and a 6600M; same memory problem. I was also able to run it on my Mac, but it takes over 1 hour per image.
Following the advice from this comment (#1278) and reverting back to an older version fixed the out-of-memory runtime error while generating images for me. It's a workaround until a newer version fixes the problem. Editing the torchsde file did not work for me.
Downloaded the program and pasted in the models (2 checkpoint files and one inpaint; I already had them downloaded). Edited the run.bat file according to the instructions under Windows (AMD GPUs).
Tried generating an image and got an error. I'm pasting my entire CMD log.