lllyasviel / Fooocus

[Bug]: Black screen on many outputs anywhere from 8 to 50+ steps #2747

Closed: AFOLcast closed this issue 6 months ago

AFOLcast commented 7 months ago

What happened?

I get a black screen on many outputs, appearing anywhere from 8 to 50+ steps into generation, whether I'm varying an image or creating new ones. I've tried reducing the number of LoRAs and so forth. It happens with or without a refiner. The console log from the current run is below.

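The console log in this report shows several sampling runs ending in `User stopped` or `User skipped` at different step counts. To list those points from a saved copy of the log, something like the following Python sketch could be used. It is not part of Fooocus: the function name `interrupted_runs` is hypothetical, the sample lines are abbreviated from the log in this report, and it assumes one log entry per line with tqdm-style `step/total [` counters.

```python
import re

# Matches the "48/60 [" counter that tqdm prints inside each progress line.
PROGRESS = re.compile(r"(\d+)/(\d+) \[")


def interrupted_runs(log_text):
    """Return (step, total, reason) for every run the user stopped or skipped.

    Hypothetical helper for illustration only; assumes one log entry per line.
    """
    runs = []
    last = None  # most recent (step, total) seen in a progress line
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            last = (int(match.group(1)), int(match.group(2)))
        for reason in ("User stopped", "User skipped"):
            if reason in line and last is not None:
                runs.append((last[0], last[1], reason))
                last = None  # do not attribute the same counter twice
    return runs


# Progress bars shortened; counters and messages taken from the log in this report.
sample = """\
  0%|          | 0/60 [00:04<?, ?it/s]
User stopped
 80%|████████  | 48/60 [01:10<00:16,  1.39s/it]Requested to load SDXL
100%|██████████| 60/60 [02:23<00:00,  2.39s/it]
 57%|█████▊    | 34/60 [00:53<00:41,  1.58s/it]
User skipped
"""

for step, total, reason in interrupted_runs(sample):
    print(f"{reason} at step {step} of {total}")
```

Running this over the full log gives the step at which each run was abandoned, which may help narrow down when the black preview first appears.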

Steps to reproduce the problem

Click Generate.

What should have happened?

Fooocus should produce normal images instead of black ones.

What browsers do you use to access Fooocus?

Google Chrome

Where are you running Fooocus?

Locally

What operating system are you using?

Windows 11

Console logs

D:\Fooocus>.\python_embeded\python.exe -s Fooocus\entry_with_update.py --always-download-new-model --preset realistic
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py', '--always-download-new-model', '--preset', 'realistic']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.3.1
Loaded preset: D:\Fooocus\Fooocus\presets\realistic.json
[Cleanup] Attempting to delete content of temp dir C:\Users\jmkna\AppData\Local\Temp\fooocus
[Cleanup] Cleanup successful
Total VRAM 6144 MB, total RAM 16200 MB
xformers version: 0.0.20
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : native
VAE dtype: torch.float32
Using xformers cross attention
Refiner unloaded.
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 264 keys at weight 0.25.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 4.22 seconds
Started worker with PID 12384
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 5
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 3664301469950835785
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 60 - 48
[Fooocus] Initializing ...
[Fooocus] Loading models ...
model_type EPS
UNet ADM Dimension 2816
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Refiner model loaded: D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['add-detail-xl.safetensors', 0.5], ['texta.safetensors', 1.0], ['BetterTextRedmond.safetensors', 1.0], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 264 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\add-detail-xl.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 722 keys at weight 0.5.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\add-detail-xl.safetensors] for CLIP [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 264 keys at weight 0.5.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\texta.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 788 keys at weight 1.0.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\BetterTextRedmond.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 722 keys at weight 1.0.
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['add-detail-xl.safetensors', 0.5], ['texta.safetensors', 1.0], ['BetterTextRedmond.safetensors', 1.0], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\add-detail-xl.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 722 keys at weight 0.5.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\texta.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 1.0.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\BetterTextRedmond.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 722 keys at weight 1.0.
Requested to load SDXLClipModel
Loading 1 new model
unload clone 1
[Fooocus Model Management] Moving model(s) has taken 5.20 seconds
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] create a hyper realistic photo of a male model in a suit on a busy Wall Street sidewalk holding a hand-written sign that says "THIS IS NOT REAL." Natural, spontaneous facial expression. natural pose, sharp focus, detailed, warm colors, inspired, illustrious, fine detail, elaborate great intricate, beautiful, elegant, amazing composition, cinematic, delicate, epic, cool
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
[Fooocus] VAE encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.36 seconds
Final resolution is (1024, 1024).
[Parameters] Denoising Strength = 0.85
[Parameters] Initial Latent shape: torch.Size([1, 4, 128, 128])
Preparation time: 187.24 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3107.6340894699097
[Fooocus Model Management] Moving model(s) has taken 68.70 seconds
  0%|                                                                                                                                      | 0/60 [00:04<?, ?it/s]
User stopped
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 320.88 seconds
[Fooocus Model Management] Moving model(s) has taken 9.10 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 5
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 3664301469950835785
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 60 - 48
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign that says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose, full color, detailed, highly educated, extremely beautiful, inspiring, thought, dramatic cinematic light, attractive, best, perfect
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Image processing ...
[Fooocus] VAE encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.81 seconds
Final resolution is (1024, 1024).
[Parameters] Denoising Strength = 0.85
[Parameters] Initial Latent shape: torch.Size([1, 4, 128, 128])
Preparation time: 3.96 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3105.710636138916
[Fooocus Model Management] Moving model(s) has taken 61.83 seconds
 80%|████████████████████████████████████████████████████████████████████████████████████████████████████                         | 48/60 [01:10<00:16,  1.39s/it]Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3094.1680431365967
[Fooocus Model Management] Moving model(s) has taken 54.45 seconds
Refiner Swapped
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60/60 [02:23<00:00,  2.39s/it]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.71 seconds
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2024-04-11\log.html
Generating and saving time: 212.76 seconds
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 220.58 seconds
[Fooocus Model Management] Moving model(s) has taken 10.65 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 5
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 3664301469950835785
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 60 - 48
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign that says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting, great colors, vivid, beautiful, intricate, elegant, highly detailed, complex, very sharp
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign that says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting, great composition, colors, intricate, elegant, light shining, highly detailed, amazing quality
[Fooocus] Preparing Fooocus text #3 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign that says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting, great composition, thought, elegant, crisp detailed, very artistic, dynamic light, colors
[Fooocus] Preparing Fooocus text #4 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign that says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting, great composition, beautiful detailed, intricate, elegant, light shining, very coherent, cute
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding positive #3 ...
[Fooocus] Encoding positive #4 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Encoding negative #3 ...
[Fooocus] Encoding negative #4 ...
[Fooocus] Image processing ...
[Fooocus] VAE encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.27 seconds
Final resolution is (1024, 1024).
[Parameters] Denoising Strength = 0.85
[Parameters] Initial Latent shape: torch.Size([1, 4, 128, 128])
Preparation time: 6.56 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3095.1333379745483
[Fooocus Model Management] Moving model(s) has taken 46.25 seconds
 80%|████████████████████████████████████████████████████████████████████████████████████████████████████                         | 48/60 [01:02<00:14,  1.24s/it]Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3085.1292066574097
[Fooocus Model Management] Moving model(s) has taken 39.43 seconds
Refiner Swapped
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60/60 [01:56<00:00,  1.94s/it]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.67 seconds
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2024-04-11\log.html
Generating and saving time: 171.17 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3086.0948762893677
[Fooocus Model Management] Moving model(s) has taken 38.38 seconds
 80%|████████████████████████████████████████████████████████████████████████████████████████████████████                         | 48/60 [01:14<00:18,  1.52s/it]Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3076.090744972229
[Fooocus Model Management] Moving model(s) has taken 38.16 seconds
Refiner Swapped
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60/60 [02:10<00:00,  2.18s/it]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.18 seconds
Image generated with private log at: D:\Fooocus\Fooocus\outputs\2024-04-11\log.html
Generating and saving time: 174.48 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3077.056414604187
[Fooocus Model Management] Moving model(s) has taken 37.84 seconds
 57%|██████████████████████████████████████████████████████████████████████▊                                                      | 34/60 [00:53<00:41,  1.58s/it]
User skipped
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
  5%|██████▎                                                                                                                       | 3/60 [00:05<01:53,  1.99s/it]
User skipped
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 450.54 seconds
[Fooocus Model Management] Moving model(s) has taken 12.07 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 5
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 806013481475394767
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 60 - 48
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['add-detail-xl.safetensors', 0.5], ['texta.safetensors', 1.0], ['BetterTextRedmond.safetensors', 0.5], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 264 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\add-detail-xl.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 722 keys at weight 0.5.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\add-detail-xl.safetensors] for CLIP [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 264 keys at weight 0.5.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\texta.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 788 keys at weight 1.0.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\BetterTextRedmond.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\realisticStockPhoto_v20.safetensors] with 722 keys at weight 0.5.
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['add-detail-xl.safetensors', 0.5], ['texta.safetensors', 1.0], ['BetterTextRedmond.safetensors', 0.5], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\add-detail-xl.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 722 keys at weight 0.5.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\texta.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 1.0.
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\BetterTextRedmond.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 722 keys at weight 0.5.
Requested to load SDXLClipModel
Loading 1 new model
unload clone 1
[Fooocus Model Management] Moving model(s) has taken 1.79 seconds
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign on corrugated cardboard. The sign says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting. Hand-written text on sign says "THIS IS NOT REAL.", sharp focus, intricate, elegant, highly detailed, fine detail, dramatic ambient light, dynamic background, shiny colors, colorful, professional built, bright color, best, gorgeous, great composition, creative, positive, vibrant, artistic, beautiful, atmosphere, perfect, brave, glossy, illuminated, vivid, pretty, attractive, confident, smart, passionate, agile, cool
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign on corrugated cardboard. The sign says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting. Hand-written text on sign says "THIS IS NOT REAL.", romantic, detailed, sharp focus, intricate, still, highly integrated, great composition, epic, dynamic light, beautiful, elegant, perfect colors, clear background, artistic, shiny, gorgeous, divine, sublime, cool, enhanced, joyful, creative, positive, pure, very handsome, focused, attractive, fine detail, pretty, amazing, awesome, marvelous, inspiring
[Fooocus] Preparing Fooocus text #3 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign on corrugated cardboard. The sign says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting. Hand-written text on sign says "THIS IS NOT REAL.", deep focus, handsome, intricate, elegant, highly detailed, wonderful colors, sharp background, dynamic, cute, professional, designed, rich bright color, amazing, shiny, attractive, pretty, classy, best, fantastic, fancy, awesome, dramatic, breathtaking, beautiful, creative, positive, cheerful, pure, focused, relaxed, still, lovely, cool, great
[Fooocus] Preparing Fooocus text #4 ...
[Prompt Expansion] create a hyperrealistic tilt-shift photo of a businessman in his mid-40s in a suit on a busy Wall Street sidewalk holding a hand-written sign on corrugated cardboard. The sign says "THIS IS NOT REAL." Natural spontaneous facial expression. natural active pose. High contrast. Dark cinematic lighting. Hand-written text on sign says "THIS IS NOT REAL.", romantic, detailed, sharp focus, still, light stunning gorgeous, glowing, attractive, delicate, agile, radiant, iconic, fine, sublime, cool, awesome, brilliant, epic, illuminated, amazing, beautiful, pure, very coherent, perfect, artistic, phenomenal, incredible detail, magical, wonderful, flowing, lush, pretty, focused, fabulous, creative
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding positive #3 ...
[Fooocus] Encoding positive #4 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Encoding negative #3 ...
[Fooocus] Encoding negative #4 ...
[Fooocus] Image processing ...
[Fooocus] VAE encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 0.84 seconds
Final resolution is (1024, 1024).
[Parameters] Denoising Strength = 0.85
[Parameters] Initial Latent shape: torch.Size([1, 4, 128, 128])
Preparation time: 22.32 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3058.9794921875
[Fooocus Model Management] Moving model(s) has taken 50.77 seconds
 17%|████████████████████▊                                                                                                        | 10/60 [00:21<01:48,  2.17s/it]
User skipped
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
 28%|███████████████████████████████████▍                                                                                         | 17/60 [00:25<01:04,  1.50s/it]
User skipped
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
 50%|██████████████████████████████████████████████████████████████▌                                                              | 30/60 [00:44<00:44,  1.47s/it]
User skipped
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 7.829630374908447
 80%|████████████████████████████████████████████████████████████████████████████████████████████████████                         | 48/60 [01:05<00:16,  1.37s/it]Requested to load SDXL
Loading 1 new model
loading in lowvram mode 3030.8984375
[Fooocus Model Management] Moving model(s) has taken 44.52 seconds
Refiner Swapped
 80%|████████████████████████████████████████████████████████████████████████████████████████████████████                         | 48/60 [01:51<00:27,  2.32s/it]
User skipped
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 278.65 seconds
[Fooocus Model Management] Moving model(s) has taken 11.50 seconds

Additional information

I did update the NVIDIA Studio driver, but the problem was already happening before that. That is why I updated, in the hope it would fix the problem. I am running old transformers on my machine.

mashb1t commented 7 months ago

Please check the related issues at https://github.com/lllyasviel/Fooocus/issues?q=is%3Aissue+black+images+is%3Aclosed and try launching with --all-in-fp32 to check whether generation then results in non-black images.
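
For reference, here is a minimal sketch of how the launch command from the report could be rebuilt with the extra flag added. The interpreter path, script path, and existing flags are copied from the log above; the helper name `build_launch_args` is hypothetical and only illustrates the idea. The script prints the command and does not run Fooocus.

```python
# Sketch only: rebuild the launch command from the report and append
# --all-in-fp32. Paths and existing flags are copied from the reporter's log;
# build_launch_args is a hypothetical helper, not part of Fooocus.

BASE_ARGS = [
    r".\python_embeded\python.exe",
    "-s",
    r"Fooocus\entry_with_update.py",
    "--always-download-new-model",
    "--preset",
    "realistic",
]


def build_launch_args(extra_flags):
    """Return the base command with extra flags appended, skipping duplicates."""
    args = list(BASE_ARGS)
    for flag in extra_flags:
        if flag not in args:
            args.append(flag)
    return args


args = build_launch_args(["--all-in-fp32"])
# Print the command line to run from the install directory (D:\Fooocus in the log).
print(" ".join(args))
```

If the images are no longer black with this flag, that would suggest a precision problem. If they are still black, the related closed issues are the next place to look.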

mashb1t commented 7 months ago

@AFOLcast bump, is there any new information from your side?

mashb1t commented 6 months ago

closing as stale