What were the prompts you used?
```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms
miranda kerr closeup -s 50 -S 1641815931 -W 512 -H 512 -C 7.5 -A k_lms
miranda kerr closeup -s 50 -S 1502365647 -W 512 -H 512 -C 7.5 -A k_lms
```
But I assume it generalizes to other prompts.
Would like to compare what they look like on my end, since I did not have those problematic eyes.
Will test after the macOS update is done ^^
Are there any instructions on how to use it with InvokeAI?
Not sure if it is related to the macOS Ventura update, but when running on the main branch I now have 5 s/it (instead of 2 s/it) 🤔
ok, also got weird eyes now xD
```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms
```
After re-downloading FFHQ_eye_mouth_landmarks_512.pth and StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth to src/gfpgan/experiments/pretrained_models, it looks much better again 🙈

```
"miranda kerr closeup" -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.7
```
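For reference, the re-download boils down to something like this (the release URL is a placeholder, not a real endpoint; take the actual links from wherever the GFPGAN project hosts these files):

```sh
# WEIGHTS_URL is a placeholder; substitute the real GFPGAN download location
WEIGHTS_URL="<gfpgan-weights-release-url>"
cd src/gfpgan/experiments/pretrained_models
curl -LO "$WEIGHTS_URL/FFHQ_eye_mouth_landmarks_512.pth"
curl -LO "$WEIGHTS_URL/StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth"
```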
> Are there any instructions on how to use it with InvokeAI?
@michaelezra In models.yaml:

```yaml
stable-diffusion-1.4:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/model.ckpt
    description: Stable Diffusion inference model version 1.4
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
add vae to the model(s). Download: https://huggingface.co/stabilityai/sd-vae-ft-mse-original/tree/main
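If it helps, the download can be scripted; a minimal sketch, assuming curl and the usual HuggingFace resolve-URL pattern:

```sh
# fetch the fine-tuned VAE next to the other model weights
# (assumes the standard HuggingFace "resolve" URL pattern)
curl -L -o models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt \
  "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.ckpt"
```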
@mauwii Interesting. I didn't use FFHQ_eye_mouth_landmarks_512.pth or StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth for any of the results posted above, but it seems to achieve a similar purpose as the new variational autoencoder, vae-ft-mse-840000-ema-pruned.ckpt.
One thing: when you run `miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms`, GFPGAN is not passed (nor are you using the new variational autoencoder), so it's normal to see the poor-looking eyes (I think).
> after re-downloading FFHQ_eye_mouth_landmarks_512.pth and StyleGAN2_512_Cmul1_FFHQ_B12G4_scratch_800k.pth to src/gfpgan/experiments/pretrained_models it looks much better again 🙈
I assume after doing this, your command changed and you passed some post-processing options? e.g.

```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.6 -ft gfpgan
```
Some comparisons w/ all the possibilities using GFPGAN / VAE:
> I assume after doing this, your command changed and you passed some post-processing options? e.g.
> `miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.6 -ft gfpgan`
Oh, yes, sorry, should have mentioned this. What happened: when I created the first picture I was totally irritated, since my GFPGAN was broken as well, so even post-processing wasn't working anymore (which I guess came from about 100 reinstallations while "playing" with the environment-mac.yml 🙈).
Long story short: the second picture was post-processed with GFPGAN 🤦🏻♂️ Hope I did not lead you down some non-existent route!
All good! I figured you were passing GFPGAN to get such a good result :) Here's another comparison:
I edited the post and added the prompt `"miranda kerr closeup" -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.7`, which should be the one I used there.
Do you think it may be useful to include these results / comparisons in https://github.com/invoke-ai/InvokeAI/blob/development/docs/features/POSTPROCESS.md? There are no photos right now that document what one may expect.
For example, using GFPGAN does help, but it may be useful to use it with the new VAE. See in https://github.com/invoke-ai/InvokeAI/issues/1279#issuecomment-1295832189 how GFPGAN-only eyes are a bit blurrier than VAE-only eyes and VAE+GFPGAN eyes.
Note: there's the option of increasing -G (face restoration strength) to try to improve the eyes, but it will also smooth out details in the face, and you can't restrict it to eyes-only restoration as far as I know. So combining GFPGAN with the VAE may be a nice idea (e.g. say you want wrinkles in the face for an old person, but also good eyes).
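For instance, rendering the same seed at two strengths makes the trade-off easy to see (hypothetical -G values, same prompt and flags as above):

```
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.4
miranda kerr closeup -s 50 -S 2328329510 -W 512 -H 512 -C 7.5 -A k_lms -G 0.8
```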
I guess your excellent comparison screenshots would of course fit very well into the post-processing docs, since pics (especially in this case) sometimes say more than 100 lines of docs could 🙈
@Any-Winter-4079 Thanks!
in models.yaml I tried v1.5:

```yaml
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
I created a symlink to vae-ft-mse-840000-ema-pruned.ckpt and started invokeai:

```sh
(invokeai) michaelezra@Michaels-MBP stable-diffusion % PATH_TO_CKPT="/Users/michaelezra/ai/CKPT"
(invokeai) michaelezra@Michaels-MBP stable-diffusion % ln -s "$PATH_TO_CKPT/vae-ft-mse-840000-ema-pruned.ckpt" models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
(invokeai) michaelezra@Michaels-MBP stable-diffusion % python scripts/invoke.py --web --model stable-diffusion-1.5
```

```
GFPGAN Initialized
CodeFormer Initialized
ESRGAN Initialized
Using device_type mps
Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
Model loaded in 5.93s
Setting Sampler to k_lms
```

Does this look right? There is no message about the VAE. Should I use any additional command-line parameter for invoke.py?
> Do you think it may be useful to include these results / comparisons in https://github.com/invoke-ai/InvokeAI/blob/development/docs/features/POSTPROCESS.md? There are no photos right now that document what one may expect.
Totally agree. Maybe I can give you some good advice on how to edit the docs comfortably:

```sh
# while in repository-root
python -m venv .venv
. .venv/bin/activate
pip install -r requirements-mkdocs.txt
mkdocs serve
```

This spins up a local version of the GitHub page so that you have a live preview (which also reloads as soon as you save some changes).
I used it so often that in the meantime I added the mkdocs requirements to my conda base env 😅
> For example, using GFPGAN does help, but it may be useful to use it with the new VAE. See in #1279 (comment) how GFPGAN-only eyes are a bit blurrier than VAE-only eyes and VAE+GFPGAN eyes.
> Note: there's the option of increasing -G (face restoration strength) to try to improve the eyes, but it will also smooth out details in the face, and you can't restrict it to eyes-only restoration as far as I know. So combining GFPGAN with the VAE may be a nice idea (e.g. say you want wrinkles in the face for an old person, but also good eyes).
Yeah, I also noticed that blurriness. I mostly used a value between 0.6 and 0.75 with GFPGAN, which varies depending on the picture itself, but here even 0.5 to 0.8 is recommended; in the end it's also kind of a personal flavor, I guess.
Combining the post-processors sounds legit, but unfortunately I didn't dive that deep into it yet 🙈 (I've never even used stuff like inpainting, outpainting, ..., but the day will come ;P )
@michaelezra: since I just want to try this as well, this is how it looks for me:
models.yaml:
```yaml
stable-diffusion-1.5:
  config: configs/stable-diffusion/v1-inference.yaml
  weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
  vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
  description: Stable Diffusion inference model version 1.5
  width: 512
  height: 512
```
running invoke.py:

```
python scripts/invoke.py --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
>> Model loaded in 12.44s
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke>
```
VAE weights are also "only" symlinked to models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
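By the way, whether the link actually resolves can be sanity-checked like this:

```sh
# -L dereferences the symlink; this errors out if the target is missing
ls -lL models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
```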
> I created a symlink to vae-ft-mse-840000-ema-pruned.ckpt
I just downloaded the file and put it into models/ldm/stable-diffusion-v1, so I don't know what the issue may be. If a symlink doesn't work in your setup, try putting the ckpt directly in the folder.
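Something like this, reusing the PATH_TO_CKPT variable from your earlier commands (a hypothetical sketch, adjust to your paths):

```sh
# swap the symlink for a real copy of the checkpoint
rm models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
cp "$PATH_TO_CKPT/vae-ft-mse-840000-ema-pruned.ckpt" models/ldm/stable-diffusion-v1/
```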
Strange... even after placing the file directly and changing the order in models.yaml so that vae: comes after weights:, I am still not getting this:

```
Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
```
So 1.5 is the default model you load? I don't see default: true here:
```yaml
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
so unless you switch models, it shouldn't load the vae ckpt.
Basically, for every model you want to use this vae with, you have to add the vae line, e.g.
```yaml
waifu-1.3:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/waifu-diffusion/inference/wd-v1-3-float32.ckpt
    description: Waifu Diffusion inference model version 1.3
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
waifu-1.3-full:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/waifu-diffusion/inference/wd-v1-3-full.ckpt
    description: Waifu Diffusion inference model version 1.3 (full)
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
```
And on startup, it will load whichever has default: true, e.g.
```yaml
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
    default: true
```
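For what it's worth, the lookup is conceptually trivial; here is an illustrative sketch (not InvokeAI's actual code, and the configs/models.yaml path is assumed) that prints which entry would win:

```sh
# illustrative only: show which model entry carries the default flag
# (assumes PyYAML is available in the active env)
python - <<'EOF'
import yaml
with open("configs/models.yaml") as f:
    models = yaml.safe_load(f)
default = next((name for name, cfg in models.items() if cfg.get("default")), None)
print("default model:", default or "no default flag set")
EOF
```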
I used to explicitly specify which model to load, like this: `python scripts/invoke.py --web --model stable-diffusion-1.5`.
Now I added default: true to v1.5:
```yaml
stable-diffusion-1.4:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/model.ckpt
    description: Stable Diffusion inference model version 1.4
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
    description: Stable Diffusion inference model version 1.5
    vae: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
    width: 512
    height: 512
    default: true
```
now it is loading v1.4 by default:

```
python scripts/invoke.py --web
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.4 from models/ldm/stable-diffusion-v1/model.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
>> Model loaded in 5.88s
>> Setting Sampler to k_lms
>> Started Invoke AI Web Server!
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
```
Still no luck with the VAE. I am using the master branch, is that OK?
You have 4 spaces in your yaml, I only have 2; maybe this is the problem, dunno. I mean the whitespace in front of description, weights, ... (a quick parse check is sketched below the yaml):
```yaml
stable-diffusion-1.5:
  description: The newest Stable Diffusion version 1.5 weight file (4.27 GB)
  weights: ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
  config: ./configs/stable-diffusion/v1-inference.yaml
  width: 512
  height: 512
  vae: ./models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
  default: true
```
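If in doubt, you can check what the yaml actually parses to; a small sketch (assumes PyYAML in the active env and that the file lives at configs/models.yaml):

```sh
python - <<'EOF'
# print every model entry and its keys, and flag missing vae lines
import yaml
with open("configs/models.yaml") as f:
    models = yaml.safe_load(f)
for name, cfg in models.items():
    print(name, "->", sorted(cfg))
    if "vae" not in cfg:
        print("  note: no 'vae' key in this entry")
EOF
```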
I copied your yaml file, but no difference:

```
Loading stable-diffusion-1.5 from ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
Model loaded in 5.87s
```
When I remove v1.5 from models.yaml, I am getting this error; seems like v1.4 is being defaulted from some place else:

```
python scripts/invoke.py --web
GFPGAN Initialized
CodeFormer Initialized
ESRGAN Initialized
Using device_type mps
"stable-diffusion-1.4" is not a known model name. Please check your models.yaml file
Model switch failed
```

args.py:
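For reference, one way to locate such a hardcoded fallback, without assuming which file it lives in:

```sh
# search all Python sources for the fallback model name
grep -rn --include="*.py" "stable-diffusion-1.4" .
```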
Could you maybe post a complete copy-paste like this one:

```
python scripts/invoke.py --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
>> Model loaded in 12.44s
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke>
```
so from the command you enter to the invoke prompt. In the last one there is `>> Loading stable-diffusion-1.4 from models/ldm/stable-diffusion-v1/model.ckpt`, which looks totally weird. Also, your paths never start with ./. OK, Any-Winter's paths also don't start with ./.
Did you try to delete the conda env, clean your git branch, and start fresh?
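i.e. something along these lines (a rough sketch; this is destructive, so stash anything you want to keep, and adjust the env name / environment file to your setup):

```sh
# discard local changes and untracked files, then rebuild the conda env
git reset --hard && git clean -fd
conda deactivate
conda env remove -n invokeai
conda env create -f environment-mac.yml
```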
here it is:

```
python scripts/invoke.py --web --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
>> Model loaded in 5.72s
>> Setting Sampler to k_lms
* --web was specified, starting web server...
>> Started Invoke AI Web Server!
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
```
I didn't try to "delete the conda env, clean your git branch and start fresh". How would you clean the git branch?
ok.. cleaned git and conda, reinstalled everything.. and the result is the same :)
Are you using the main branch, where this functionality is not yet implemented? 🤔
At least in that branch there are still those 4 whitespaces instead of 2 in models.yaml, and when I use it, I also just get:
```
python scripts/invoke.py --model stable-diffusion-1.5
* Initializing, be patient...
NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Loading stable-diffusion-1.5 from ./models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
>> Model loaded in 9.96s
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke>
```
The yaml was of course updated.
oh... yes, I am in the main branch :)) Which branch should I be using instead to try the VAE?
well, if you need to ask, you should maybe better wait for a release ;P
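If you want to try anyway, switching is just (assuming a clean working tree):

```sh
git branch --show-current   # check where you are
git checkout development
git pull
```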
I switched to the development branch.. do I need to recreate the environment again?
@Any-Winter-4079 I re-deployed a 2-week-old version of the docs, where it wasn't as broken as it is now; just take a look at https://mauwii.github.io/stable-diffusion/installation/INSTALL_MAC/. Where the script blocks start, there is another very nice feature: those switchable tabs.
Or this site https://mauwii.github.io/stable-diffusion/features/PROMPTS/ is also something which could never look this good on GitHub itself. But well, humanity is doomed to not be able to have nice things, I guess 🤷🏻♂️
Houston, we have a victory! :)

```
python scripts/invoke.py --web --model stable-diffusion-1.5
GFPGAN Initialized
CodeFormer Initialized
ESRGAN Initialized
Using device_type mps
Loading stable-diffusion-1.5 from models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly.ckpt
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
Model loaded in 5.82s
Setting Sampler to k_lms
```

Thanks for your help!
> @Any-Winter-4079 I re-deployed a 2-week-old version of the docs, where it wasn't as broken as it is now; just take a look at https://mauwii.github.io/stable-diffusion/installation/INSTALL_MAC/. Where the script blocks start, there is another very nice feature: those switchable tabs.
> Or this site https://mauwii.github.io/stable-diffusion/features/PROMPTS/ is also something which could never look this good on GitHub itself. But well, humanity is doomed to not be able to have nice things, I guess 🤷🏻♂️
I think this is from https://github.com/invoke-ai/InvokeAI/issues/1278 🙈 But yep, your docs look awesome! By the way, is there a link to the gh-pages on the main repo page? It might help people migrate to using gh-pages more (personally I'm just too lazy to try to find the gh-page, hence why I almost always read the plain markdown docs on GitHub).
Edit: Lol, I should have clicked first and written second xD
Every link in the main README.md points to it (like Installation, inpainting, outpainting, bob ross painting, ....), as well as a link at the top of the README.md and in the info box of the repo in the upper right corner, so I'm not sure how many links would be necessary 🤷🏻♂️
BTW: just try out the search bar in the MkDocs and you'll directly know why it is totally imba vs. plain old markdown on GH ;P
Oh, my bad! Then it's fine 🙈 Odd as it is, I barely scroll down (I mainly use the Issues, Pull Requests, etc. tabs at the top, and the code files, but I almost never scroll down to the readme). No wonder I didn't find it :P
Without having counted them, I guess the MkDocs page is linked at least 20 times in the readme 🤗 But if you have any other good ideas on how to point to the site, I (and for sure @lstein as well) would appreciate it a lot 👻
@Any-Winter-4079, I'm hijacking this discussion thread just to check up. I haven't heard anything from you for about two weeks and wonder if you're still active on the project?
Busy week (3 exams and 4 projects) here that ends in 2 days. Most I've done these days is leave Dreambooth running (in colab) in the background :) I'll probably come back to check where everything is at on Monday.
I believe the technical issues were resolved here. Hope your exams went well @Any-Winter-4079 !
Finally got to try vae-ft-mse-840000-ema-pruned.ckpt, and my first impression is: wow, it seems to do better with eyes! Have you noticed any other benefits (e.g. for landscapes, etc.)? I think it would be good to document the strengths and weaknesses of this new VAE and also add it to the docs.