KohakuBlueleaf / a1111-sd-webui-lycoris

An extension for stable-diffusion-webui to load lycoris models.
Apache License 2.0

Different results when calling Lora file through SD-lora extension and SD-lycoris extension #41

Closed TheOnlyHolyMoly closed 1 year ago

TheOnlyHolyMoly commented 1 year ago

Hello,

I ran a test yesterday with SD.next, and some of my test cases used lora / lyco (in anticipation of the patched version of the sd-lycoris extension): https://github.com/KohakuBlueleaf/a1111-sd-webui-lycoris/issues/34

I used this Lora (https://civitai.com/models/45622/90s-tmnt-donatello-realistic) with the settings given at the bottom. I put the lora file into both the models/lora and models/lycoris folders and then called it from the prompt, alternating between the lora: and lyco: prefixes.

What I found out:

Calling the lora file through lora: repeatedly gives the same image (Image A).
Calling the lora file through lyco: repeatedly gives the same image (Image B).

Images A and B are completely different (neither is beautiful, but this is only a test of functionality). Do you have any idea where the difference could come from? Is a1111 evaluating the lora in a buggy manner? Keeping @vladmandic in the loop.

00073-1842838212-Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello_1 9  teenage mutant ninja turtle

00074-1842838212-Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello_1 9  teenage mutant ninja turtle

Prompt: Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello:1.9 teenage mutant ninja turtle
Negative prompt: Worst quality, bad quality, low effort
Steps: 26 | Sampler: DPM++ 2M SDE Karras | CFG scale: 10 | Seed: 1842838212 | Face restoration: GFPGAN | Size: 512x512 | Model hash: 6ce0161689 | Model: v1-5-pruned-emaonly | VAE: vae-ft-mse-840000-ema-pruned (1) | Denoising strength: 0.5 | Clip skip: 1 | Version: 02c9640 | Parser: Full parser | Hires upscale: 2 | Hires steps: 14 | Hires upscaler: Latent | Dynamic thresholding enabled: True | Mimic scale: 7 | Threshold percentile: 100 | Mimic mode: Power Down | Mimic scale minimum: 0 | CFG mode: Power Down | CFG scale minimum: 0 | Power scheduler value: 4

File metadata parameters: Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello:1.9 teenage mutant ninja turtle Negative prompt: Worst quality, bad quality, low effort Steps: 26, Sampler: DPM++ 2M SDE Karras, CFG scale: 10, Seed: 1842838212, Face restoration: GFPGAN, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, VAE: vae-ft-mse-840000-ema-pruned (1), Denoising strength: 0.5, Clip skip: 1, Version: 02c9640, Parser: Full parser, Hires upscale: 2, Hires steps: 14, Hires upscaler: Latent, Dynamic thresholding enabled: True, Mimic scale: 7, Threshold percentile: 100, Mimic mode: Power Down, Mimic scale minimum: 0, CFG mode: Power Down, CFG scale minimum: 0, Power scheduler value: 4

vladmandic commented 1 year ago

i can't reproduce; i'm getting identical results (within reason, as i'm not using a deterministic cross-attention optimization). not sure if it's a copy-and-paste or quoting error, or whether there is an actual problem with your prompt, as the file name appears twice, once without and once with the lyco prefix?

TheOnlyHolyMoly commented 1 year ago

@vladmandic Interesting remark. Indeed, I got this result on the DirectML environment with subquadratic. I'll quickly run this again on my CUDA + Torch live environment; maybe my eyes were just off last night.

TheOnlyHolyMoly commented 1 year ago

Okay, double-checked. There is a deviation between the live and test environments: the lora/lyco calls are consistent in the torch/cuda/sdp setup and inconsistent in the directml/subquadratic setup.

Image A (straight copy from process image tab)

Prompt: Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello:1.9 teenage mutant ninja turtle
Negative prompt: Worst quality, bad quality, low effort
Steps: 26 | Sampler: DPM++ 2M SDE Karras | CFG scale: 10 | Seed: 1842838212 | Face restoration: GFPGAN | Size: 512x512 | Model hash: 6ce0161689 | Model: v1-5-pruned-emaonly | VAE: vae-ft-mse-840000-ema-pruned (1) | Denoising strength: 0.5 | Clip skip: 1 | Version: 02c9640 | Parser: Full parser | Hires upscale: 2 | Hires steps: 14 | Hires upscaler: Latent | Lora hashes: TMNTDonatello: 9c427126fe80 | Dynamic thresholding enabled: True | Mimic scale: 7 | Threshold percentile: 100 | Mimic mode: Power Down | Mimic scale minimum: 0 | CFG mode: Power Down | CFG scale minimum: 0 | Power scheduler value: 4

File metadata parameters: Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello:1.9 teenage mutant ninja turtle Negative prompt: Worst quality, bad quality, low effort Steps: 26, Sampler: DPM++ 2M SDE Karras, CFG scale: 10, Seed: 1842838212, Face restoration: GFPGAN, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, VAE: vae-ft-mse-840000-ema-pruned (1), Denoising strength: 0.5, Clip skip: 1, Version: 02c9640, Parser: Full parser, Hires upscale: 2, Hires steps: 14, Hires upscaler: Latent, Lora hashes: TMNTDonatello: 9c427126fe80, Dynamic thresholding enabled: True, Mimic scale: 7, Threshold percentile: 100, Mimic mode: Power Down, Mimic scale minimum: 0, CFG mode: Power Down, CFG scale minimum: 0, Power scheduler value: 4

Image B (straight copy from process image tab)

Prompt: Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello:1.9 teenage mutant ninja turtle
Negative prompt: Worst quality, bad quality, low effort
Steps: 26 | Sampler: DPM++ 2M SDE Karras | CFG scale: 10 | Seed: 1842838212 | Face restoration: GFPGAN | Size: 512x512 | Model hash: 6ce0161689 | Model: v1-5-pruned-emaonly | VAE: vae-ft-mse-840000-ema-pruned (1) | Denoising strength: 0.5 | Clip skip: 1 | Version: 02c9640 | Parser: Full parser | Hires upscale: 2 | Hires steps: 14 | Hires upscaler: Latent | Dynamic thresholding enabled: True | Mimic scale: 7 | Threshold percentile: 100 | Mimic mode: Power Down | Mimic scale minimum: 0 | CFG mode: Power Down | CFG scale minimum: 0 | Power scheduler value: 4

File metadata parameters: Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello:1.9 teenage mutant ninja turtle Negative prompt: Worst quality, bad quality, low effort Steps: 26, Sampler: DPM++ 2M SDE Karras, CFG scale: 10, Seed: 1842838212, Face restoration: GFPGAN, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, VAE: vae-ft-mse-840000-ema-pruned (1), Denoising strength: 0.5, Clip skip: 1, Version: 02c9640, Parser: Full parser, Hires upscale: 2, Hires steps: 14, Hires upscaler: Latent, Dynamic thresholding enabled: True, Mimic scale: 7, Threshold percentile: 100, Mimic mode: Power Down, Mimic scale minimum: 0, CFG mode: Power Down, CFG scale minimum: 0, Power scheduler value: 4
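To make the difference between the two metadata dumps easier to spot, here is a minimal sketch that splits the comma-separated settings into key/value pairs and reports only the keys that differ. The helper names `parse_parameters` and `diff_settings` are made up for illustration and are not part of any webui API, and the two strings are shortened excerpts of the Image A and Image B settings above (starting at "Steps:", since the prompt itself contains commas).

```python
def parse_parameters(text: str) -> dict[str, str]:
    """Split a 'Key: value, Key: value' settings string into a dict.

    Hypothetical helper: assumes fields are separated by ', ' and that the
    first ': ' in each field separates the key from the value.
    """
    settings = {}
    for field in text.split(", "):
        key, sep, value = field.partition(": ")
        if sep:  # skip fragments that are not 'key: value' pairs
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Shortened excerpts of the settings reported for Image A and Image B.
image_a = (
    "Steps: 26, Sampler: DPM++ 2M SDE Karras, CFG scale: 10, "
    "Seed: 1842838212, Hires upscaler: Latent, "
    "Lora hashes: TMNTDonatello: 9c427126fe80, "
    "Dynamic thresholding enabled: True"
)
image_b = (
    "Steps: 26, Sampler: DPM++ 2M SDE Karras, CFG scale: 10, "
    "Seed: 1842838212, Hires upscaler: Latent, "
    "Dynamic thresholding enabled: True"
)

diff = diff_settings(parse_parameters(image_a), parse_parameters(image_b))
print(diff)  # {'Lora hashes': ('TMNTDonatello: 9c427126fe80', None)}
```

On these excerpts the only differing entry is "Lora hashes", which is present for Image A and absent for Image B; seed, sampler and the other settings match.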

Double-checked on the DirectML + subquadratic environment: it reproducibly gives me two completely distinct images when I switch between the lyco and lora prompt calls.
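One way to put a number on "identical within reason" versus "fully distinct" is to compare the raw pixel values of two outputs instead of judging by eye. The sketch below is only an illustration: `mean_abs_diff` and `same_within_tolerance` are hypothetical helpers, the tolerance of 1.0 is an arbitrary assumption, and the byte strings are tiny made-up stand-ins for real image data.

```python
def mean_abs_diff(pixels_a: bytes, pixels_b: bytes) -> float:
    """Average absolute difference per channel value (0 = identical, max 255)."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same size and mode")
    if not pixels_a:
        return 0.0
    total = sum(abs(x - y) for x, y in zip(pixels_a, pixels_b))
    return total / len(pixels_a)


def same_within_tolerance(pixels_a: bytes, pixels_b: bytes,
                          tolerance: float = 1.0) -> bool:
    """True if the images differ by at most `tolerance` on average.

    The default tolerance is an assumed value, not a recommended one.
    """
    return mean_abs_diff(pixels_a, pixels_b) <= tolerance


# Made-up 4-value "images": b is a slightly noisy copy of a, c is unrelated.
a = bytes([10, 20, 30, 40])
b = bytes([11, 20, 29, 40])
c = bytes([200, 0, 255, 90])

print(mean_abs_diff(a, b), same_within_tolerance(a, b))  # 0.5 True
print(mean_abs_diff(a, c), same_within_tolerance(a, c))  # 121.25 False
```

For real files, the pixel bytes could be obtained with, for example, Pillow's `Image.open(path).convert("RGB").tobytes()`. Comparing decoded pixels rather than file hashes avoids false differences caused by the embedded generation metadata.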

00094-1842838212-Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello_1 9  teenage mutant ninja turtle

00095-1842838212-Photography, pink, Sports car, best quality, masterpiece, TMNTDonatello_1 9  teenage mutant ninja turtle

Now validating the live system... and it is consistent.

TheOnlyHolyMoly commented 1 year ago

So effectively it does not seem to be a lycoris handler issue, as I originally thought, so I'm fine with closing this here. How do we handle the non-deterministic behaviour in the test environment? Are there any other settings I should use? @vladmandic