Open user425846 opened 2 months ago
remove --always-gpu
will be faster
Thanks for the quick reply. Unfortunately that made it even worse: the request now takes over 50 seconds to complete. Moving the models alone takes around 35 seconds now.
Sorry. I did more testing based on the information you provided. I'm not very sure, but I think it may have been due to insufficient available GPU memory. Due to equipment problems, I can only do limited testing here. If you can, could you please test how well Fooocus-API runs when it has exclusive use of the GPU?
Can you clarify what you mean by exclusive GPUs?
I also want to mention that, after testing, the same issue happens with regular Fooocus too, not just when using your API.
It means not running any other programs that occupy GPU memory.
Got it. Yes, I'm not doing that: I run only Fooocus on an RTX A6000 Ada with 48 GB of VRAM.
Okay, so as far as I understand it, when patching the model for inpainting, it creates a clone. Clones are always unloaded afterwards. Even if I remove the code for unloading the clones, when it looks for already loaded models, the new clone doesn't exactly match the old one, so it is reloaded regardless.
Any idea how to fix this?
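To make the behaviour described above concrete, here is a minimal sketch in plain Python. `Patcher` and `Loaded` are hypothetical stand-ins, not the actual Fooocus classes. The only assumptions are that a clone shares the underlying weights while being a distinct wrapper object, and that the loaded-model list compares wrappers by identity.

```python
class Patcher:
    """Hypothetical stand-in for a model wrapper; not the real Fooocus class."""

    def __init__(self, weights, options=None):
        self.weights = weights              # shared, expensive-to-load object
        self.options = dict(options or {})  # per-clone settings

    def clone(self):
        # A clone reuses the same weights but is a new wrapper object.
        return Patcher(self.weights, self.options)

    def is_clone(self, other):
        return self.weights is other.weights


class Loaded:
    """Hypothetical list entry that compares wrappers by identity."""

    def __init__(self, patcher):
        self.patcher = patcher

    def __eq__(self, other):
        return isinstance(other, Loaded) and self.patcher is other.patcher


base = Patcher(weights=object())
loaded_models = [Loaded(base)]

clone = base.clone()
print(Loaded(clone) in loaded_models)  # False: the wrapper is a different object
print(clone.is_clone(base))            # True: the weights are the same object
```

Under these assumptions, a lookup keyed on the wrapper misses every fresh clone, even though nothing new would actually have to be moved to the GPU.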
It seems that you have studied the Fooocus code more deeply than I have. I have seen #2811.
I will review the code and try to fix it.
Thanks for looking into this. I'm not too fluent in Python, unfortunately; C# and Dart/Flutter are my main languages. If I can test anything, or if you have anything specific I can look into, let me know. I'm available.
I did some more testing.
Changing the code I mentioned in #2811 to this quick workaround:
    for index, item in enumerate(current_loaded_models):
        if item.model == loaded_model.model:
            print("DEBUG: Found Model at index " + str(index) + ": " + item.model.__class__.__name__)
            break
    else:
        index = -1
    if index != -1:
        current_loaded_models.insert(0, current_loaded_models.pop(index))
        models_already_loaded.append(loaded_model)
    else:
        if hasattr(x, "model"):
            print(f"Requested to load {x.model.__class__.__name__}")
        models_to_load.append(loaded_model)
works: it finds the model. HOWEVER, the model is still a clone, so it is being unloaded. I can prevent the unloading of clones by commenting out the unloading in the `unload_model_clones` method; then it actually works and doesn't reload the model. But the new clone is still being created, which fills up the VRAM.
BUT then I get some artifacting in the image. I have attached an image without this modification and one with it. It almost seems like the refiner is somehow not run for the last few steps, but I'm not 100% sure.
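The lookup part of the workaround above relies on Python's `for ... else` idiom. The following is a standalone sketch of that idiom on a plain list. `promote_if_loaded` and the `same` callback are made-up names for illustration and do not exist in Fooocus.

```python
def promote_if_loaded(cache, wanted, same):
    """Move the first entry matching `wanted` to the front of `cache`.

    `same(item, wanted)` decides what counts as a match. Returns True on
    a hit and False on a miss (the cache is left untouched on a miss).
    """
    for index, item in enumerate(cache):
        if same(item, wanted):
            break
    else:
        # The loop finished without `break`, so nothing matched.
        return False
    cache.insert(0, cache.pop(index))
    return True


cache = ["vae", "clip", "sdxl"]
hit = promote_if_loaded(cache, "sdxl", lambda a, b: a == b)
print(hit, cache)  # True ['sdxl', 'vae', 'clip']
```

Passing the match rule in as `same` keeps the question "what counts as the same model" separate from the list bookkeeping, and that question is the one this thread is trying to answer.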
Found this issue via Google. Any update, @mrhan1993 @user425846? Willing to pay you for a fix.
I don't have a solution yet; maybe mrhan has one.
I am trying, but I haven't solved it yet.
Let me know if I can do anything to help. I'm very eager to fix this.
@mrhan1993 any update ?
Haha i think we are stressing him too much :D
yes 😅
I carefully tracked the whole generation process, and I'm not sure if this is the cause of the problem.
The SDXL model is always unloaded and then reloaded, while the other models just move from one list to another. The reload happens because the check `if loaded_model in current_loaded_models:` returns False when testing whether SDXL is in the loaded list.
I compared how it changes during processing, and there are some differences between the SDXL being looked up and the SDXL already in `current_loaded_models`, as shown in the following figure.
I have already tried using the same seed; it doesn't work.
I will continue to explore this question, but I am not sure whether I can finally find out what the problem is.
Willing to pay you $300 in BTC for a fix, @mrhan1993.
One likely, if pessimistic, possibility is that this may not be a bug.
During inpaint processing, information is added to the SDXL model, namely to its `model_options` attribute, which is likely to differ from one run to the next. I'm sorry, but this is the best I can do with my ability. Maybe lllyasviel has a deeper understanding of this.
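One way to check the hypothesis above would be to dump the options of the old and the new model object and compare them. This is a minimal sketch, assuming the attribute behaves like a dictionary. `diff_options` is a hypothetical helper and the keys in the example are invented, not real Fooocus option names.

```python
def diff_options(old, new):
    """Return {key: (old_value, new_value)} for keys whose values differ."""
    missing = object()  # sentinel meaning "key absent on this side"
    changed = {}
    for key in old.keys() | new.keys():
        a = old.get(key, missing)
        b = new.get(key, missing)
        if a is missing or b is missing or a != b:
            changed[key] = (None if a is missing else a,
                            None if b is missing else b)
    return changed


# Invented example values, for illustration only.
before = {"sampler_cfg": 4.0}
after = {"sampler_cfg": 4.0, "inpaint_patch": "inpaint_v26"}
print(diff_options(before, after))  # {'inpaint_patch': (None, 'inpaint_v26')}
```

Values such as tensors do not compare cleanly with `!=`, so a real check would need a type-aware comparison in place of the plain inequality used here.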
Hi @mrhan1993
@mashb1t found the solution. It is just a single line that needs to be changed; you can find it in https://github.com/lllyasviel/Fooocus/issues/2811
Line 362 in `async_worker.py`: use `use_synthetic_refiner = False` instead of `use_synthetic_refiner = True`.
The issue is that while this works on a clean install of Fooocus, it does not work with your API: editing the line in `Fooocus-API/repositories/Fooocus/modules/async_worker.py` has no effect.
Maybe with this information you would be able to fix this, and maybe even add it as a setting in config.txt or as a request parameter? There are some things that do not work with this change:
This will limit you if you have the need to use refiners as well as when using face swap / other IP adapters, but if you don't need this you're good to go!
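The suggestion of exposing this as a setting or a request parameter could look roughly like the sketch below. It assumes a JSON-formatted settings file. The key name `use_synthetic_refiner` as a config entry or request field is hypothetical, and neither Fooocus nor Fooocus-API is known to read it. `resolve_flag` is an invented helper.

```python
import json


def resolve_flag(name, request_params, config_text, default):
    """Pick a boolean flag: request parameter first, then config, then default."""
    if name in request_params:
        return bool(request_params[name])
    config = json.loads(config_text) if config_text.strip() else {}
    return bool(config.get(name, default))


# Hypothetical config content and request payloads, for illustration only.
config_text = '{"use_synthetic_refiner": false}'
print(resolve_flag("use_synthetic_refiner", {}, config_text, True))   # False
print(resolve_flag("use_synthetic_refiner",
                   {"use_synthetic_refiner": True}, config_text, True))  # True
```

Letting the request override the config would allow callers that need refiners or IP adapters to keep the current behaviour while others opt out.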
@user425846 Try this in `fooocusapi/worker.py`; I will confirm later.
Just tried it out: setting `use_synthetic_refiner = False` in `fooocusapi/worker.py`, line 432, did not fix the reloading. It correctly logs `Refiner unloaded.`, as it does in the clean Fooocus install, but the model is still reloaded anyway. I also tried setting it to False in both `fooocusapi/worker.py` and `Fooocus-API/repositories/Fooocus/modules/async_worker.py`.
This is really bad news :joy:
Is it a bigger problem than it seems like? Because the original solution for the standard Fooocus install is very simple, just this one small change.
😂 This is the problem: `worker.py` and `async_worker.py` are almost exactly the same, but now one works properly and the other does not.
I did a standard install of Fooocus, changed `async_worker.py` line 362 to `use_synthetic_refiner = False`, started it with `python launch.py --listen 0.0.0.0 --port 7865 --always-gpu --disable-offload-from-vram`, and then ran inpaint twice with the default parameters and the prompt "background".
Here are my run log and Fooocus page parameters.
Can you provide the log from your successful run, the Fooocus page options you used, and anything else you can think of?
(fooocus) PS D:\Fooocus> python launch.py --listen 0.0.0.0 --port 7865 --always-gpu --disable-offload-from-vram
[System ARGV] ['launch.py', '--listen', '0.0.0.0', '--port', '7865', '--always-gpu', '--disable-offload-from-vram']
Python 3.10.10 | packaged by Anaconda, Inc. | (main, Mar 21 2023, 18:39:17) [MSC v.1916 64 bit (AMD64)]
Fooocus version: 2.3.1
[Cleanup] Attempting to delete content of temp dir C:\Users\mrhan1993\AppData\Local\Temp\fooocus
[Cleanup] Cleanup successful
Total VRAM 24564 MB, total RAM 65298 MB
Set vram state to: HIGH_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : native
VAE dtype: torch.bfloat16
Using pytorch cross attention
2024-05-06 10:51:24.665467: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
WARNING:tensorflow:From C:\Users\mrhan1993\AppData\Roaming\Python\Python310\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.
[System ARGV] ['launch.py', '--listen', '0.0.0.0', '--port', '7865', '--always-gpu', '--disable-offload-from-vram']
Python 3.10.10 | packaged by Anaconda, Inc. | (main, Mar 21 2023, 18:39:17) [MSC v.1916 64 bit (AMD64)]
Fooocus version: 2.3.1
Refiner unloaded.
[Cleanup] Attempting to delete content of temp dir C:\Users\mrhan1993\AppData\Local\Temp\fooocus
[Cleanup] Cleanup successful
Running on local URL: http://0.0.0.0:7865
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
loaded straight to GPU
Requested to load SDXL
Loading 1 new model
Base model loaded: D:\AI\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [D:\AI\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [D:\AI\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [D:\AI\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
To create a public link, set `share=True` in `launch()`.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.50 seconds
Started worker with PID 15980
App started successful. Use the app with http://localhost:7865/ or 0.0.0.0:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 5422324584526021678
[Fooocus] Downloading upscale models ...
[Fooocus] Downloading inpainter ...
[Inpaint] Current inpaint model is D:\AI\Fooocus\models\inpaint\inpaint_v26.fooocus.patch
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 24
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0], ('D:\\AI\\Fooocus\\models\\inpaint\\inpaint_v26.fooocus.patch', 1.0)] for model [D:\AI\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [D:\AI\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [D:\AI\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Loaded LoRA [D:\AI\Fooocus\models\inpaint\inpaint_v26.fooocus.patch] for UNet [D:\AI\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 960 keys at weight 1.0.
Requested to load SDXLClipModel
Loading 1 new model
unload clone 1
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] background, light translucent, transparent, full, detailed background, intricate, elegant, highly contrasted, dramatic, sharp focus, inspired, beautiful, aesthetic, innocent, fine detail, professional composition, color spread, artistic, enhanced, lush, fancy, cute, perfect, elaborate, iconic, best, ambient, fresh, modern, futuristic, trendy, creative, cool, awesome
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
Upscaling image with shape (837, 837, 3) ...
[Fooocus] VAE Inpaint encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus] VAE encoding ...
Final resolution is (1024, 1024), latent is (1024, 1024).
[Parameters] Denoising Strength = 1
[Parameters] Initial Latent shape: torch.Size([1, 4, 128, 128])
Preparation time: 5.62 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
unload clone 3
[Fooocus Model Management] Moving model(s) has taken 0.83 seconds
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:04<00:00, 6.49it/s]
Image generated with private log at: D:\AI\Fooocus\outputs\2024-05-06\log.html
Generating and saving time: 5.99 seconds
Total time: 11.65 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 1529338463900703148
[Fooocus] Downloading upscale models ...
[Fooocus] Downloading inpainter ...
[Inpaint] Current inpaint model is D:\AI\Fooocus\models\inpaint\inpaint_v26.fooocus.patch
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 24
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] background, beautiful, highly detailed, dramatic light, sharp focus, intricate, elegant, dynamic, vibrant colors, open background, professional, fine detail, cinematic, enhanced, mystical, iconic, best, creative, quiet, unique, cute, friendly, charming, pretty, attractive, cool, elite, color guarded, extremely, lush, inspired, clear, artistic, positive
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
Upscaling image with shape (837, 837, 3) ...
[Fooocus] VAE Inpaint encoding ...
[Fooocus] VAE encoding ...
Final resolution is (1024, 1024), latent is (1024, 1024).
[Parameters] Denoising Strength = 1
[Parameters] Initial Latent shape: torch.Size([1, 4, 128, 128])
Preparation time: 4.22 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
unload clone 3
[Fooocus Model Management] Moving model(s) has taken 0.83 seconds
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:04<00:00, 6.47it/s]
Image generated with private log at: D:\AI\Fooocus\outputs\2024-05-06\log.html
Generating and saving time: 5.99 seconds
Total time: 10.24 seconds
You won't believe me, but I'm unable to reproduce my success from yesterday. Setting it to False does not fix it anymore. Man, I didn't expect this problem to be so much work, hahaha.
I have tried it multiple times now on fresh installations and I can't get it to work anymore. I'm lost right now.
:joy: :joy: Maybe your success was with txt2img?
@user425846 I'm sorry, I'm also clueless and can't support or debug any further, as it works on my machines, both MacBook and Windows... Can you please confirm whether it is still creating clones when using the adjusted code in Colab?
I have tested it again just now, and I "solved" it by choosing a faster server to run it on, so loading the model is much quicker. I have attached the full log below; you can see that it is still unloading clones and moving models on every request. You can also see that it correctly isn't using any refiner and even logs `Refiner unloaded.`
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 3081818957029639826
[Fooocus] Downloading upscale models ...
[Fooocus] Downloading inpainter ...
[Inpaint] Current inpaint model is /workspace/Fooocus/models/inpaint/inpaint_v26.fooocus.patch
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 24
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0], ('/workspace/Fooocus/models/inpaint/inpaint_v26.fooocus.patch', 1.0)] for model [/workspace/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [/workspace/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/workspace/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Loaded LoRA [/workspace/Fooocus/models/inpaint/inpaint_v26.fooocus.patch] for UNet [/workspace/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors] with 960 keys at weight 1.0.
Requested to load SDXLClipModel
Loading 1 new model
unload clone 1
[Fooocus Model Management] Moving model(s) has taken 0.53 seconds
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] plane, highly detailed, intricate, sharp focus, beautiful, symmetry, cinematic, fine composition, cool color, ambient light, dynamic background, cute, iconic, deep aesthetic, innocent, alive, pure, full detail, inspired, designed, rich clear professional, artistic, fancy, creative, fair, passionate, amazing, inspiring, marvelous, brilliant, epic, thought, monumental
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
Upscaling image with shape (666, 789, 3) ...
[Fooocus] VAE Inpaint encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus] VAE encoding ...
Final resolution is (1024, 1024), latent is (896, 1088).
[Parameters] Denoising Strength = 1
[Parameters] Initial Latent shape: Image Space (896, 1088)
Preparation time: 8.75 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
unload clone 3
[Fooocus Model Management] Moving model(s) has taken 0.92 seconds
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:04<00:00, 7.11it/s]
Image generated with private log at: /workspace/Fooocus/outputs/2024-05-06/log.html
Generating and saving time: 5.72 seconds
Total time: 34.90 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 665115900573930083
[Fooocus] Downloading upscale models ...
[Fooocus] Downloading inpainter ...
[Inpaint] Current inpaint model is /workspace/Fooocus/models/inpaint/inpaint_v26.fooocus.patch
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 24
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] plane, full perfect, vivid colors, extremely detailed, beautiful, cinematic, stunning, light, gorgeous, intricate detail, very inspirational, original composition, ambient created, clear, elegant, artistic, sharp focus, highly thought focused, professional still, amazing quality, attractive, bright background, dynamic, fine, trendy, best, open, new, sleek, futuristic, color
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
Upscaling image with shape (666, 789, 3) ...
[Fooocus] VAE Inpaint encoding ...
[Fooocus] VAE encoding ...
Final resolution is (1024, 1024), latent is (896, 1088).
[Parameters] Denoising Strength = 1
[Parameters] Initial Latent shape: Image Space (896, 1088)
Preparation time: 3.45 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
unload clone 3
[Fooocus Model Management] Moving model(s) has taken 0.92 seconds
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:04<00:00, 7.21it/s]
Image generated with private log at: /workspace/Fooocus/outputs/2024-05-06/log.html
Generating and saving time: 5.63 seconds
Total time: 9.11 seconds
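To compare runs like the ones logged above without reading every line, the timing entries can be pulled out with two regular expressions. This is a small sketch. `summarize` is a made-up helper, and the patterns are based only on the log lines shown in this thread.

```python
import re

MOVE_RE = re.compile(r"Moving model\(s\) has taken ([\d.]+) seconds")
TOTAL_RE = re.compile(r"Total time: ([\d.]+) seconds")


def summarize(log_text):
    """Count model-move events and collect per-request total times."""
    moves = [float(x) for x in MOVE_RE.findall(log_text)]
    totals = [float(x) for x in TOTAL_RE.findall(log_text)]
    return {
        "move_events": len(moves),
        "move_seconds": round(sum(moves), 2),
        "totals": totals,
    }


# Lines copied from the log above.
sample = """\
[Fooocus Model Management] Moving model(s) has taken 0.53 seconds
[Fooocus Model Management] Moving model(s) has taken 0.92 seconds
Total time: 34.90 seconds
[Fooocus Model Management] Moving model(s) has taken 0.92 seconds
Total time: 9.11 seconds
"""
print(summarize(sample))
```

Because the patterns do not depend on line breaks, the same function also works on a log that was pasted as a single run of text.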
Hi,
despite using the arguments, every time an inpaint request is started, it moves models. This takes over 10 seconds every time. I have tried disabling all LoRAs and styles, but that didn't change anything. Any ideas? Unfortunately, this is not usable for me like that. Below is the log for launching and doing two identical inpainting requests back to back.