What were the prompts you used?
```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms
miranda kerr closeup -s 50 -S 1641815931 -W 512 -H 512 -C 7.5 -A k_lms
miranda kerr closeup -s 50 -S 1502365647 -W 512 -H 512 -C 7.5 -A k_lms
```
But I assume it generalizes to other prompts.
Would like to compare what they look like on my end, since I did not have those problematic eyes.
Will test after the macOS update is done ^^
Are there any instructions on how to use it with InvokeAI?
Not sure if it is related to the macOS Ventura update, but when running on the main branch I now have 5 s/it (instead of 2 s/it) 🤔
ok, also got weird eyes now xD
```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms
```
After re-downloading FFHQ_eye_mouth_landmarks_512.pth and StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth to src/gfpgan/experiments/pretrained_models, it looks much better again 🙈

```
"miranda kerr closeup" -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.7
```
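For reference, the re-download boils down to something like this (the release URL is a placeholder, not a real endpoint; take the actual links from wherever the GFPGAN project hosts these files):

```sh
# WEIGHTS_URL is a placeholder; substitute the real GFPGAN download location
WEIGHTS_URL="<gfpgan-weights-release-url>"
cd src/gfpgan/experiments/pretrained_models
curl -LO "$WEIGHTS_URL/FFHQ_eye_mouth_landmarks_512.pth"
curl -LO "$WEIGHTS_URL/StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth"
```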
> Are there any instructions on how to use it with InvokeAI?
@michaelezra In models.yaml:

```yaml
stable-diffusion-1.4:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/model.ckpt
    description: Stable Diffusion inference model version 1.4
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
add vae to the model(s). Download: https://huggingface.co/stabilityai/sd-vae-ft-mse-original/tree/main
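If it helps, the download can be scripted; a minimal sketch, assuming curl and the usual HuggingFace resolve-URL pattern:

```sh
# fetch the fine-tuned VAE next to the other model weights
# (assumes the standard HuggingFace "resolve" URL pattern)
curl -L -o models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt \
  "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.ckpt"
```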
@mauwii Interesting. I didn't use FFHQ_eye_mouth_landmarks_512.pth or StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth for any of the results posted above, but it seems to achieve a similar purpose as the new variational autoencoder, vae-ft-mse-840000-ema-pruned.ckpt.
One thing: when you run `miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms`, GFPGAN is not passed (nor are you using the new variational autoencoder), so it's normal to see the poor-looking eyes (I think).
> after re-downloading FFHQ_eye_mouth_landmarks_512.pth and StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth to src/gfpgan/experiments/pretrained_models it looks much better again 🙈
I assume after doing this, your command changed and you passed some post-processing options? e.g.

```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.6 -ft gfpgan
```
Some comparisons w/ all the possibilities using GFPGAN / VAE:
> I assume after doing this, your command changed and you passed some post-processing options? e.g.
> `miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.6 -ft gfpgan`
Oh, yes, sorry, should have mentioned this. What happened: when I created the first picture I was totally irritated, since my GFPGAN was broken as well, so even post-processing wasn't working anymore (which I guess came from about 100 reinstallations while "playing" with the environment-mac.yml 🙈).
Long story short: the second picture was post-processed with GFPGAN 🤦🏻♂️ Hope I did not lead you down some non-existent route!
All good! I figured you were passing GFPGAN to get such a good result :) Here's another comparison:
I edited the post and added the prompt `"miranda kerr closeup" -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.7`, which should be the one I used there.
Do you think it may be useful to include these results / comparisons in https://github.com/invoke-ai/InvokeAI/blob/development/docs/features/POSTPROCESS.md? There are no photos right now that document what one may expect.
For example, using GFPGAN does help, but it may be useful to use it with the new VAE. See in https://github.com/invoke-ai/InvokeAI/issues/1279#issuecomment-1295832189 how GFPGAN-only eyes are a bit blurrier than VAE-only eyes and VAE+GFPGAN eyes.
Note: there's the option of increasing -G (face restoration strength) to try to improve the eyes, but it will also smooth out details in the face, and you can't restrict it to eyes-only restoration as far as I know. So combining GFPGAN with the VAE may be a nice idea (e.g. say you want wrinkles in the face for an old person, but also good eyes).
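For instance, rendering the same seed at two strengths makes the trade-off easy to see (hypothetical -G values, same prompt and flags as above):

```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.4
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.8
```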
I guess your excellent comparison screenshots would of course fit very well into the post-processing docs, since pics (especially in this case) sometimes say more than 100 lines of docs could 🙈
@Any-Winter-4079 Thanks!
in models.yaml I tried v1.5:

```yaml
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
I created a symlink to vae-ft-mse-840000-ema-pruned.ckpt and started invokeai:

```sh
(invokeai) michaelezra@Michaels-MBP stable-diffusion % PATH_TO_CKPT="/Users/michaelezra/ai/CKPT"
(invokeai) michaelezra@Michaels-MBP stable-diffusion % ln -s "$PATH_TO_CKPT/vae-ft-mse-840000-ema-pruned.ckpt" models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
(invokeai) michaelezra@Michaels-MBP stable-diffusion % python scripts/invoke.py --web --model stable-diffusion-1.5
```

```
GFPGAN Initialized
CodeFormer Initialized
ESRGAN Initialized
Using device_type mps
Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
Model loaded in 5.93s
Setting Sampler to k_lms
```

Does this look right? There is no message about the VAE. Should I use any additional command-line parameter for invoke.py?
> Do you think it may be useful to include these results / comparisons in https://github.com/invoke-ai/InvokeAI/blob/development/docs/features/POSTPROCESS.md? There are no photos right now that document what one may expect.
Totally agree. Maybe I can give you some good advice on how to edit the docs comfortably:

```sh
# while in repository-root
python -m venv .venv
. .venv/bin/activate
pip install -r requirements-mkdocs.txt
mkdocs serve
```

This spins up a local version of the GitHub page so that you have a live preview (which also reloads as soon as you save some changes).
I used it so often that in the meantime I added the mkdocs requirements to my conda base env 😅
> For example, using GFPGAN does help, but it may be useful to use it with the new VAE. See in #1279 (comment) how GFPGAN-only eyes are a bit blurrier than VAE-only eyes and VAE+GFPGAN eyes.
> Note: there's the option of increasing -G (face restoration strength) to try to improve the eyes, but it will also smooth out details in the face, and you can't restrict it to eyes-only restoration as far as I know. So combining GFPGAN with the VAE may be a nice idea (e.g. say you want wrinkles in the face for an old person, but also good eyes).
Yeah, I also noticed that blurriness. I mostly used a value between 0.6 and 0.75 with GFPGAN, which varies depending on the picture itself, but here even 0.5 to 0.8 is recommended; in the end it's also kind of a personal flavor, I guess.
Combining the post-processors sounds legit, but unfortunately I didn't dive that deep into it yet 🙈 (I've never even used stuff like inpainting, outpainting, ..., but the day will come ;P )
@michaelezra: since I just want to try this as well, this is how it looks for me:
models.yaml:
```yaml
stable-diffusion-1.5:
  config: configs/stable-diffusion/v1-inference.yaml
  weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
  vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
  description: Stable Diffusion inference model version 1.5
  width: 512
  height: 512
```
running invoke.py:

```
python scripts/invoke.py --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
>> Model loaded in 12.44s
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke>
```
VAE weights are also "only" symlinked to models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
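By the way, whether the link actually resolves can be sanity-checked like this:

```sh
# -L dereferences the symlink; this errors out if the target is missing
ls -lL models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
```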
> I created a symlink to vae-ft-mse-840000-ema-pruned.ckpt
I just downloaded the file and put it into models/ldm/stable-diffusion-v1, so I don't know what the issue may be. If a symlink doesn't work in your setup, try putting the ckpt directly in the folder.
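Something like this, reusing the PATH_TO_CKPT variable from your earlier commands (a hypothetical sketch, adjust to your paths):

```sh
# swap the symlink for a real copy of the checkpoint
rm models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
cp "$PATH_TO_CKPT/vae-ft-mse-840000-ema-pruned.ckpt" models/ldm/stable-diffusion-v1/
```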
Strange... even after placing the file directly and changing the order in models.yaml so that vae: comes after weights:, I am still not getting this:

```
Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
```
So 1.5 is the default model you load? I don't see default: true here:
```yaml
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
so unless you switch models, it shouldn't load the vae ckpt.
Basically, for every model you want to use this vae with, you have to add the vae line, e.g.
```yaml
waifu-1.3:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/waifu-diffusion/inference/wd-v1-3-float32.ckpt
    description: Waifu Diffusion inference model version 1.3
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
waifu-1.3-full:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/waifu-diffusion/inference/wd-v1-3-full.ckpt
    description: Waifu Diffusion inference model version 1.3 (full)
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
And on startup, it will load whichever has default: true, e.g.
```yaml
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
    default: true
```
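For what it's worth, the lookup is conceptually trivial; here is an illustrative sketch (not InvokeAI's actual code, and the configs/models.yaml path is assumed) that prints which entry would win:

```sh
# illustrative only: show which model entry carries the default flag
# (assumes PyYAML is available in the active env)
python - <<'EOF'
import yaml
with open("configs/models.yaml") as f:
    models = yaml.safe_load(f)
default = next((name for name, cfg in models.items() if cfg.get("default")), None)
print("default model:", default or "no default flag set")
EOF
```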
I used to explicitly specify which model to load, like this: `python scripts/invoke.py --web --model stable-diffusion-1.5`.
Now I added default: true to v1.5:
```yaml
stable-diffusion-1.4:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/model.ckpt
    description: Stable Diffusion inference model version 1.4
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
    default: true
```
now it is loading v1.4 by default:

```
python scripts/invoke.py --web
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.4 from models/ldm/stable-diffusion-v1/model.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
>> Model loaded in 5.88s
>> Setting Sampler to k_lms
>> Started Invoke AI Web Server!
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
```
Still no luck with the VAE. I am using the master branch, is that OK?
You have 4 spaces in your yaml, I only have 2; maybe this is the problem, dunno. I mean the whitespace in front of description, weights, ... (a quick parse check is sketched below the yaml):
```yaml
stable-diffusion-1.5:
  description: The newest Stable Diffusion version 1.5 weight file (4.27 GB)
  weights: ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
  config: ./configs/stable-diffusion/v1-inference.yaml
  width: 512
  height: 512
  vae: ./models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
  default: true
```
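If in doubt, you can check what the yaml actually parses to; a small sketch (assumes PyYAML in the active env and that the file lives at configs/models.yaml):

```sh
python - <<'EOF'
# print every model entry and its keys, and flag missing vae lines
import yaml
with open("configs/models.yaml") as f:
    models = yaml.safe_load(f)
for name, cfg in models.items():
    print(name, "->", sorted(cfg))
    if "vae" not in cfg:
        print("  note: no 'vae' key in this entry")
EOF
```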
I copied your yaml file, but no difference:

```
Loading stable-diffusion-1.5 from ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
Model loaded in 5.87s
```
When I remove v1.5 from models.yaml, I am getting this error; seems like v1.4 is being defaulted from some place else:

```
python scripts/invoke.py --web
GFPGAN Initialized
CodeFormer Initialized
ESRGAN Initialized
Using device_type mps
"stable-diffusion-1.4" is not a known model name. Please check your models.yaml file
Model switch failed
```

args.py:
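For reference, one way to locate such a hardcoded fallback, without assuming which file it lives in:

```sh
# search all Python sources for the fallback model name
grep -rn --include="*.py" "stable-diffusion-1.4" .
```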
Could you maybe post a complete copy-paste like this one:

```
python scripts/invoke.py --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
>> Model loaded in 12.44s
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke>
```
so from the command you enter to the invoke prompt. In the last one there is `>> Loading stable-diffusion-1.4 from models/ldm/stable-diffusion-v1/model.ckpt`, which looks totally weird. Also, your paths never start with ./. OK, Any-Winter's paths also don't start with ./.
Did you try to delete the conda env, clean your git branch, and start fresh?
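i.e. something along these lines (a rough sketch; this is destructive, so stash anything you want to keep, and adjust the env name / environment file to your setup):

```sh
# discard local changes and untracked files, then rebuild the conda env
git reset --hard && git clean -fd
conda deactivate
conda env remove -n invokeai
conda env create -f environment-mac.yml
```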
here it is:

```
python scripts/invoke.py --web --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
>> Model loaded in 5.72s
>> Setting Sampler to k_lms
* --web was specified, starting web server...
>> Started Invoke AI Web Server!
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
```
I didn't try to "delete the conda env, clean your git branch and start fresh". How would you clean the git branch?
ok.. cleaned git and conda, reinstalled everything.. and the result is the same :)
Are you using the main branch, where this functionality is not yet implemented? 🤔
At least in that branch there are still those 4 whitespaces instead of 2 in models.yaml, and when I use it, I also just get:
```
python scripts/invoke.py --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
>> Model loaded in 9.96s
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke>
```
The yaml was of course updated.
oh... yes, I am in the main branch :)) Which branch should I be using instead to try the VAE?
well, if you need to ask, you should maybe better wait for a release ;P
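If you want to try anyway, switching is just (assuming a clean working tree):

```sh
git branch --show-current   # check where you are
git checkout development
git pull
```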
I switched to the development branch.. do I need to recreate the environment again?
@Any-Winter-4079 I re-deployed a 2-week-old version of the docs, where it wasn't as broken as it is now; just take a look at https://mauwii.github.io/stable-diffusion/installation/INSTALL_MAC/. Where the script blocks start, there is another very nice feature: those switchable tabs.
Or this site https://mauwii.github.io/stable-diffusion/features/PROMPTS/ is also something which could never look this good on GitHub itself. But well, humanity is doomed to not be able to have nice things, I guess 🤷🏻♂️
Houston, we have a victory! :)

```
python scripts/invoke.py --web --model stable-diffusion-1.5
GFPGAN Initialized
CodeFormer Initialized
ESRGAN Initialized
Using device_type mps
Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
Model loaded in 5.82s
Setting Sampler to k_lms
```

Thanks for your help!
> @Any-Winter-4079 I re-deployed a 2-week-old version of the docs, where it wasn't as broken as it is now; just take a look at https://mauwii.github.io/stable-diffusion/installation/INSTALL_MAC/. Where the script blocks start, there is another very nice feature: those switchable tabs.
> Or this site https://mauwii.github.io/stable-diffusion/features/PROMPTS/ is also something which could never look this good on GitHub itself. But well, humanity is doomed to not be able to have nice things, I guess 🤷🏻♂️
I think this is from https://github.com/invoke-ai/InvokeAI/issues/1278 🙈 But yep, your docs look awesome! By the way, is there a link to the gh-pages on the main repo page? It might help people migrate to using gh-pages more (personally I'm just too lazy to try to find the gh-page, hence why I almost always read the plain markdown docs on GitHub).
Edit: Lol, I should have clicked first and written second xD
Every link in the main README.md points to it (like Installation, inpainting, outpainting, bob ross painting, ....), as well as a link at the top of the README.md and in the info box of the repo in the upper right corner, so I'm not sure how many links would be necessary 🤷🏻♂️
BTW: just try out the search bar in the MkDocs and you'll directly know why it is totally imba vs. plain old markdown on GH ;P
Oh, my bad! Then it's fine 🙈 Odd as it is, I barely scroll down (I mainly use the Issues, Pull Requests, etc. tabs at the top, and the code files, but I almost never scroll down to the readme). No wonder I didn't find it :P
Without having counted them, I guess the MkDocs page is linked at least 20 times in the readme 🤗 But if you have any other good ideas on how to point to the site, I (and for sure @lstein as well) would appreciate it a lot 👻
@Any-Winter-4079, I'm hijacking this discussion thread just to check up. I haven't heard anything from you for about two weeks and wonder if you're still active on the project?
Busy week (3 exams and 4 projects) here that ends in 2 days. Most I've done these days is leave Dreambooth running (in colab) in the background :) I'll probably come back to check where everything is at on Monday.
I believe the technical issues were resolved here. Hope your exams went well @Any-Winter-4079 !
Finally got to try vae-ft-mse-840000-ema-pruned.ckpt, and my first impression is: wow, it seems to do better with eyes! Have you noticed any other benefits (e.g. for landscapes, etc.)? I think it would be good to document the strengths and weaknesses of this new VAE and also add it to the docs.