lllyasviel / Fooocus

Focus on prompting and generating
GNU General Public License v3.0

[Bug]: Unexpected result using inpainting with certain models #2497

Closed Tominator7 closed 8 months ago

Tominator7 commented 8 months ago

What happened?

With some models (I noticed it with PonyXL-based merges), inpainting gives unexpected results in both the default mode and the modify mode. Changing the guidance scale and other values doesn't seem to have much effect on the issue. I tried to reproduce the problem in Forge with the same models, but inpainting worked as expected there. This is a clean Fooocus install without extensions or changes; the only non-standard part is the model. The "improve detail" inpainting mode works as expected; only the other options give these results.

Example:

Original image: 240310_sample

Inpainting output (with prompt "wearing a crown"): 240310_sample_inpainting

Steps to reproduce the problem

  1. Clean Fooocus install.
  2. Get models from CivitAI (e.g. Anime Confetti Comrade Mix or AutismMix SDXL)
  3. Select one of these models as the base model and try inpainting (e.g. with the prompt "wearing a crown").

What should have happened?

An image with a normal-looking crown on the head.

What browsers do you use to access Fooocus?

Mozilla Firefox

Where are you running Fooocus?

Locally

What operating system are you using?

Windows 10

Console logs

C:\Users\Tom\Fooocus_win64_2-1-831>.\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\\entry_with_update.py']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.2.1
Total VRAM 8192 MB, total RAM 16314 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.57 seconds
Started worker with PID 6444
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 1645858324664140608
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors
Request to load LoRAs [['None', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors].
Requested to load SDXLClipModel
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 3.49 seconds
[Fooocus] Processing prompts ...
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (1024, 1024)
Preparation time: 56.29 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 2.69 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:16<00:00,  1.82it/s]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.39 seconds
Image generated with private log at: C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\outputs\2024-03-10\log.html
Generating and saving time: 22.80 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.54 seconds
  7%|█████▌                                                                             | 2/30 [00:02<00:29,  1.06s/it]
User stopped
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 82.81 seconds
[Fooocus Model Management] Moving model(s) has taken 1.87 seconds
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 9060872689677232124
[Fooocus] Downloading upscale models ...
[Fooocus] Downloading inpainter ...
[Inpaint] Current inpaint model is C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\inpaint\inpaint_v26.fooocus.patch
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Synthetic Refiner Activated
Synthetic Refiner Activated
Request to load LoRAs [['None', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0], ('C:\\Users\\Tom\\Fooocus_win64_2-1-831\\Fooocus\\models\\inpaint\\inpaint_v26.fooocus.patch', 1.0)] for model [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors].
Loaded LoRA [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\inpaint\inpaint_v26.fooocus.patch] for UNet [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors] with 960 keys at weight 1.0.
Request to load LoRAs [['None', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors].
Requested to load SDXLClipModel
Loading 1 new model
unload clone 1
[Fooocus Model Management] Moving model(s) has taken 1.41 seconds
[Fooocus] Processing prompts ...
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Fooocus] Image processing ...
Upscaling image with shape (662, 753, 3) ...
[Fooocus] VAE Inpaint encoding ...
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.50 seconds
[Fooocus] VAE encoding ...
Final resolution is (1024, 1024), latent is (960, 1088).
[Parameters] Denoising Strength = 1
[Parameters] Initial Latent shape: torch.Size([1, 4, 120, 136])
Preparation time: 20.50 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 3.37 seconds
 50%|█████████████████████████████████████████                                         | 15/30 [00:10<00:11,  1.32it/s]Requested to load SDXL
Loading 1 new model
unload clone 0
[Fooocus Model Management] Moving model(s) has taken 2.53 seconds
Refiner Swapped
100%|██████████████████████████████████████████████████████████████████████████████████| 30/30 [00:23<00:00,  1.26it/s]
Requested to load AutoencoderKL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 1.57 seconds
Image generated with private log at: C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\outputs\2024-03-10\log.html
Generating and saving time: 30.98 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 3.61 seconds
 13%|███████████                                                                        | 4/30 [00:05<00:34,  1.33s/it]
User stopped
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Total time: 60.46 seconds
[Fooocus Model Management] Moving model(s) has taken 1.49 seconds
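
For triage, the settings that were active during the inpainting run can be pulled out of a log like the one above. The snippet below is a minimal sketch and not part of Fooocus: `summarize`, `_last`, and `LOG` are hypothetical names, the sample lines are copied from the log above, and the factor of 8 between the pixel-space size and the latent shape is the usual SD/SDXL VAE downscaling factor, assumed here rather than read from the log.

```python
import re

# Sample lines copied verbatim from the console log above.
LOG = r"""
Base model loaded: C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors
[Inpaint] Current inpaint model is C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\inpaint\inpaint_v26.fooocus.patch
Loaded LoRA [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\inpaint\inpaint_v26.fooocus.patch] for UNet [C:\Users\Tom\Fooocus_win64_2-1-831\Fooocus\models\checkpoints\autismmixSDXL_autismmixDPO.safetensors] with 960 keys at weight 1.0.
Final resolution is (1024, 1024), latent is (960, 1088).
[Parameters] Denoising Strength = 1
[Parameters] Initial Latent shape: torch.Size([1, 4, 120, 136])
"""

# Assumption: the usual SD/SDXL VAE downscaling factor; the log does not state it.
VAE_DOWNSCALE = 8


def _last(pattern, text):
    """Return the last regex match in text (a str or a tuple of groups), or None."""
    found = re.findall(pattern, text)
    return found[-1] if found else None


def summarize(log):
    """Collect the settings of the most recent run found in a console log."""
    # File name after the last backslash of the Windows path.
    base = _last(r"Base model loaded: .*\\([^\\\n]+)", log)
    patch = _last(r"Current inpaint model is .*\\([^\\\n]+)", log)
    # Key count and weight of the inpaint patch applied to the UNet.
    lora = _last(
        r"Loaded LoRA \[.*\.patch\] for UNet .* with (\d+) keys at weight (\d+(?:\.\d+)?)",
        log,
    )
    denoise = _last(r"Denoising Strength = (\d+(?:\.\d+)?)", log)
    pixels = _last(r"latent is \((\d+), (\d+)\)", log)
    shape = _last(
        r"Initial Latent shape: torch\.Size\(\[(\d+), (\d+), (\d+), (\d+)\]\)", log
    )

    # Sanity check: the reported pixel size divided by the VAE factor should
    # equal the last two dimensions of the latent tensor.
    latent_ok = None
    if pixels and shape:
        latent_ok = (
            int(pixels[0]) // VAE_DOWNSCALE == int(shape[2])
            and int(pixels[1]) // VAE_DOWNSCALE == int(shape[3])
        )

    return {
        "base_model": base,
        "inpaint_patch": patch,
        "patch_keys": int(lora[0]) if lora else None,
        "patch_weight": float(lora[1]) if lora else None,
        "denoising_strength": float(denoise) if denoise else None,
        "latent_matches_pixels": latent_ok,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value}")
```

On the sample lines this reports the AutismMix checkpoint, the `inpaint_v26.fooocus.patch` patch applied with 960 keys at weight 1.0, and a denoising strength of 1. That confirms the failing run had the inpaint patch loaded on top of the non-standard base model.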

Additional information

No response

mashb1t commented 8 months ago

Which inpaint mode are you using? Please try the default or "improve detail" mode rather than the mode for adding new objects, which may add different things depending on the model you're using. If you need more control, you can find the inpainting options under advanced > advanced > developer debug mode > inpaint. Closing, as this is not an issue with Fooocus.

Tominator7 commented 8 months ago

Tried default and modify. The above example is using the default inpaint mode.

soctib commented 7 months ago

Can confirm. Any pony-derived model produces complete garbage when used for in/outpainting. At the very least this should be documented.