leejet / stable-diffusion.cpp

Stable Diffusion in pure C/C++
MIT License

Poor img2img results. Is it ok? #79

Closed. alexanderbrodko closed this issue 7 months ago.

alexanderbrodko commented 8 months ago

txt2img works fine for me, but img2img gives blurry, abstract images.

[images: original image, initial image]

./sd.exe -m v2.bin -p "old two-storied american mansion entrance porch, bushes, second floor, door and windows nailed up with boards" -t 6 --sampling-method dpm++2mv2 --mode img2img -i Untitled.jpg --strength 0.2 --seed -1

[image: strength=0.2]

--strength 0.7

[image: strength=0.7]

I tried several images and different sampling methods, and also tried the negative prompt "blur, blurry"; all of them give results like these. The model is 512-base-ema.ckpt (the v2 base model, which works fine for txt2img).

Full output
$ ./sd.exe -m v2.bin -p "old two-storied american mansion entrance porch, bushes, second floor, door and windows nailed up with boards" -t 6 --sampling-method dpm++2mv2 --seed -1 --mode img2img -i Untitled.jpg --strength 0.7 -v
Option:
    n_threads:       6
    mode:            img2img
    model_path:      v2.bin
    output_path:     output.png
    init_img:        Untitled.jpg
    prompt:          old two-storied american mansion entrance porch, bushes, second floor, door and windows nailed up with boards
    negative_prompt:
    cfg_scale:       7.00
    width:           512
    height:          512
    sample_method:   dpm++2mv2
    schedule:        default
    sample_steps:    20
    strength:        0.70
    rng:             cuda
    seed:            28994
System Info:
    BLAS = 0
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
[INFO]  stable-diffusion.cpp:2832 - loading model from 'v2.bin'
[DEBUG] stable-diffusion.cpp:2840 - verifying magic
[DEBUG] stable-diffusion.cpp:2851 - loading hparams
[INFO]  stable-diffusion.cpp:2860 - model type: SD2.x
[INFO]  stable-diffusion.cpp:2868 - ftype: q8_0
[DEBUG] stable-diffusion.cpp:2874 - loading vocab
[DEBUG] stable-diffusion.cpp:2902 - ggml tensor size = 320 bytes
[DEBUG] stable-diffusion.cpp:2907 - clip params ctx size =  360.00 MB
[DEBUG] stable-diffusion.cpp:2926 - unet params ctx size =  1406.42 MB
[DEBUG] stable-diffusion.cpp:2947 - vae params ctx size =  179.12 MB
[DEBUG] stable-diffusion.cpp:2968 - preparing memory for the weights
[DEBUG] stable-diffusion.cpp:2984 - loading weights
[DEBUG] stable-diffusion.cpp:3087 - model size = 1923.54MB
[INFO]  stable-diffusion.cpp:3096 - total params size = 1923.98MB (clip 358.70MB, unet 1405.51MB, vae 159.77MB)
[INFO]  stable-diffusion.cpp:3098 - loading model from 'v2.bin' completed, takin
[DEBUG] stable-diffusion.cpp:3431 - diffusion context need 16.61MB static memory, with work_size needing 5.31MB
[INFO]  stable-diffusion.cpp:3892 - sampling using modified DPM++ (2M) method
[INFO]  stable-diffusion.cpp:3561 - step 1 sampling completed, taking 31.30s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 2 sampling completed, taking 30.70s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 3 sampling completed, taking 32.15s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 4 sampling completed, taking 31.58s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 5 sampling completed, taking 31.49s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 6 sampling completed, taking 32.33s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 7 sampling completed, taking 32.68s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 8 sampling completed, taking 32.13s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 9 sampling completed, taking 31.19s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 10 sampling completed, taking 31.08s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 11 sampling completed, taking 31.47s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 12 sampling completed, taking 33.07s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 13 sampling completed, taking 39.65s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 14 sampling completed, taking 34.51s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3561 - step 15 sampling completed, taking 38.46s
[DEBUG] stable-diffusion.cpp:3565 - diffusion graph use 396.74MB runtime memory: static 16.61MB, dynamic 380.13MB
[DEBUG] stable-diffusion.cpp:3566 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:3960 - diffusion graph use 1802.26MB of memory: params 1405.51MB, runtime 396.74MB (static 16.61MB, dynamic 380.13MB)
[DEBUG] stable-diffusion.cpp:3961 - 66560 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:4367 - sampling completed, taking 493.82s
[DEBUG] stable-diffusion.cpp:4131 - vae context need 10.16MB static memory, with work_size needing 0.00MB
[DEBUG] stable-diffusion.cpp:4162 - computing vae graph completed, taking 71.56s
[INFO]  stable-diffusion.cpp:4185 - vae graph use 2220.92MB of memory: params 159.77MB, runtime 2061.16MB (static 10.16MB, dynamic 2051.00MB)
[DEBUG] stable-diffusion.cpp:4186 - 3146752 bytes of dynamic memory has not been released yet
[INFO]  stable-diffusion.cpp:4379 - decode_first_stage completed, taking 71.61s
[INFO]  stable-diffusion.cpp:4393 - img2img completed in 599.98s, use 3535.86MB of memory: peak params memory 1923.98MB, peak runtime memory 2061.16MB
save result image to 'output.png'
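
The options at the top of this log list sample_steps: 20 and strength: 0.70, yet only 15 sampling steps were run. That is what one would expect if img2img starts from a partially noised copy of the init image and denoises only about the last `strength` fraction of the schedule. The arithmetic can be sketched as below. The function name and the exact rounding (floor, plus one extra step) are assumptions chosen to match this log; they are not taken from the stable-diffusion.cpp source.

```python
import math


def img2img_steps(sample_steps: int, strength: float) -> int:
    """Hypothetical helper: estimate how many denoising steps img2img runs.

    Assumption: only the last `strength` fraction of the schedule is
    denoised, plus one step, capped at the full schedule length.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # round() guards against floating-point error in products like 20 * 0.7
    t_enc = math.floor(round(sample_steps * strength, 6))
    return min(sample_steps, t_enc + 1)


print(img2img_steps(20, 0.7))  # 15, the number of steps seen in the log
print(img2img_steps(20, 0.2))  # 5 under the same assumption
```

Under this assumption a low strength such as 0.2 leaves very few denoising steps, so it may be worth raising the total step count when experimenting with small strength values.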

alexanderbrodko commented 8 months ago

The Python version gives results like this:

[image: 00015]

Jonathhhan commented 8 months ago

@alexanderbrodko I get similar results. For me, it only works well with this model (128x128): https://huggingface.co/NikolayKozloff/stable-diffusion-nano-2-1-ggml/tree/main
The same is true with the new .gguf model format.

Jonathhhan commented 7 months ago

I found a good setting for img2img:

- No TAESD (it crashes).
- vae_decode_only = false (otherwise it crashes).
- The input and output images need to be the same size; otherwise there are strange artifacts.
- strength = 0.5.
- More sample steps (30-40).
- EULER_A, DPMPP2S_A or LCM sample method.
- Do not generate images with a size of 768*768 or bigger; otherwise it crashes.
- A LoRA adapter seems to improve the result.

Tested with: v1-5-pruned-emaonly-f16.gguf
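
The settings listed above can be written down as a small pre-flight check. This is only a sketch: the function, its parameter names, and the reading of "768*768 or bigger" as a pixel-count limit are assumptions made for illustration. The thresholds are the values reported in this comment, not limits documented by the project.

```python
def check_img2img_settings(in_size, out_size, strength, steps, method):
    """Hypothetical helper: list deviations from the settings reported to work.

    Sizes are (width, height) tuples; `method` is the sample method name.
    """
    warnings = []
    if in_size != out_size:
        warnings.append("input and output sizes differ (reported to cause artifacts)")
    width, height = out_size
    # Assumption: "768*768 or bigger" is read as a total pixel count.
    if width * height >= 768 * 768:
        warnings.append("output is 768*768 or bigger (reported to crash)")
    if not 30 <= steps <= 40:
        warnings.append("sample steps outside the suggested 30-40 range")
    if method.upper() not in {"EULER_A", "DPMPP2S_A", "LCM"}:
        warnings.append("sample method is not EULER_A, DPMPP2S_A or LCM")
    if abs(strength - 0.5) > 1e-9:
        warnings.append("strength differs from the suggested 0.5")
    return warnings


# Settings that follow the list produce no warnings.
print(check_img2img_settings((512, 512), (512, 512), 0.5, 30, "EULER_A"))  # []
# Settings that break every item on the list produce one warning each.
for message in check_img2img_settings((512, 512), (768, 768), 0.7, 20, "dpm++2mv2"):
    print(message)
```

The check only compares numbers and names, so it does not cover the TAESD, vae_decode_only, or LoRA items, which depend on how the program is built and invoked.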

leejet commented 7 months ago

The issue should be fixed now. You can pull the latest code and give it another try.

alexanderbrodko commented 7 months ago

It works so fine now. Thanks!

img2img_output

alexanderbrodko commented 7 months ago

Also, the "cat with blue eyes" example at strength=0.4 differs strongly from the smooth one in README.md.

img2img_output1

leejet commented 7 months ago

> Also, the "cat with blue eyes" example at strength=0.4 differs strongly from the smooth one in README.md.
>
> img2img_output1

I will find some time to update the pictures in the README; many things have changed.

Jonathhhan commented 7 months ago

> It works so fine now. Thanks!

I want to second that. Really great.