invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0

Compare argument list with SD vanilla repo #479

Closed Neosettler closed 2 years ago

Neosettler commented 2 years ago

Greetings,

I'm in the process of migrating from the SD vanilla repo to this one. I have a few questions regarding the arguments/options:

Some arguments are duplicated: once at initialization and once in the main loop. That's confusing.

-sampler is set during initialization; is that necessary, since it can also be set in the main loop?

-full_precision is set only at initialization; there is no equivalent in the main loop. If it is changed after initialization, does this mean we must reload the model for the new precision to take effect?

-n_sample I had a hard time figuring out how "n_sample" and "n_iter" differed before. This repo seems to have kept only "iterations"; could anyone clear this up? Why is n_sample not needed?

-model we used to be able to pass the model file directly, which was very handy for swapping models quickly. Now there is a bunch of hard-coded stuff to deal with, like the names of the model files and some config madness. What happened here?

Any insight would be appreciated. Thank you,

blessedcoolant commented 2 years ago

-sampler is set during initialization; is that necessary, since it can also be set in the main loop?

Yes, because this repo allows you to change the sampler on the fly for each prompt without needing to restart the entire application.

The init argument is for boot and the prompt argument is for prompts. If you don't set a prompt argument, it'll default to your init argument.

-full_precision is set only at initialization; there is no equivalent in the main loop. If it is changed after initialization, does this mean we must reload the model for the new precision to take effect?

Yes.

-n_sample I had a hard time figuring out how "n_sample" and "n_iter" differed before. This repo seems to have kept only "iterations"; could anyone clear this up? Why is n_sample not needed?

Because most systems were completely incapable of creating multiple samples in the same batch. Iterations perform the same job while still providing accurate seeds for each generated image, and they actually run across different systems. And because this repo runs per prompt, you don't really ever find the need for batches. On top of the programming difficulties batch size was creating, we decided to get rid of it.

You can just use iterations and you will be perfectly fine.
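
For example, asking for several images in the interactive loop is just something like the line below (assuming -n is still the shorthand for --iterations); each of the four images gets its own seed reported:

dream> "a forest at dawn" -n 4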

-model we used to be able to pass the model file directly, which was very handy for swapping models quickly. Now there is a bunch of hard-coded stuff to deal with, like the names of the model files and some config madness. What happened here?

This is a lot more organized and streamlined now. You can set up your models once in config/models.yaml and then use the -model argument to switch between them. If you don't want to set them up, you can still use --weights to override everything and load like you did before.

Neosettler commented 2 years ago

Thank you for your support blessedcoolant.

Mind if I ask for a use case example with --weights?

Note: it seems like it has been deprecated the hard way.

blessedcoolant commented 2 years ago

Note: it seems like it has been deprecated the hard way.

Yes. It has been. My bad. So much to keep track of.

So I'd really recommend configuring models.yaml. It takes a few seconds and allows for the easy switching you are looking for.

Neosettler commented 2 years ago

1.5 is coming soon, and it seems like adding a new model is not trivial... will it set us back, waiting for an implementation from this repo?

blessedcoolant commented 2 years ago

1.5 is coming soon, and it seems like adding a new model is not trivial... will it set us back, waiting for an implementation from this repo?

Implementing a new model would take seconds -- literally, as long as there are no architectural changes, which 1.5 most likely will not have. So what you really have to wait for is for someone to add those few lines of code and push the button.

But even better, you don't need to wait for anyone. You can just copy-paste the existing few lines in models.yaml, change 1.4 to 1.5, and you'll be good to go yourself. Actually, I think even replacing the older model.ckpt with the new one will do the trick without any other changes required.
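
The 1.5 entry would presumably look something like this (assuming 1.5 keeps the v1 inference config and you drop the checkpoint in the usual folder):

stable-diffusion-1.5:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-5.ckpt
    width: 512
    height: 512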

What you really need help for is when a massive upgrade happens. At that point, there's a TON of active devs on this repo every day upgrading things asap.

lstein commented 2 years ago

Exactly as @blessedcoolant says. Upgrading to the 1.5 weights should be a snap. We'll post step-by-step instructions here.

Neosettler commented 2 years ago

Thank you so much for your inputs,

I'm in the midst of connecting every single parameter to UI components. For clarity's sake, my suggestion would be to add an argument category ('checkpoint' | 'GFPGAN' | etc.):

parser.add_argument(
    '-F',
    '--full_precision',
    category='checkpoint',
    dest='full_precision',
    action='store_true',
    help='Use more memory-intensive full precision math for calculations',
)

Furthermore, add a "static" or "dynamic" flag to indicate that modifying a parameter is only taken into account after reloading the entire model. That is VERY IMPORTANT to know!

parser.add_argument(
    '-F',
    '--full_precision',
    category='checkpoint',
    static='yes',
    dest='full_precision',
    action='store_true',
    help='Use more memory-intensive full precision math for calculations',
)
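
For what it's worth, stock argparse does not accept category or static keywords, so carrying that metadata would need a thin wrapper of some kind. A minimal sketch of one possible approach (the helper and dictionary names here are hypothetical, not part of the repo):

import argparse

ARG_METADATA = {}  # dest -> {'category': ..., 'static': ...}

def add_argument_with_meta(parser, *flags, category=None, static=False, **kwargs):
    # Register the argument with argparse as usual, then record the extra
    # UI metadata separately, keyed by the argument's destination name.
    action = parser.add_argument(*flags, **kwargs)
    ARG_METADATA[action.dest] = {'category': category, 'static': static}
    return action

parser = argparse.ArgumentParser()
add_argument_with_meta(
    parser,
    '-F',
    '--full_precision',
    category='checkpoint',
    static=True,  # hypothetical: changing this only takes effect after the model is reloaded
    dest='full_precision',
    action='store_true',
    help='Use more memory-intensive full precision math for calculations',
)

A UI could then group its controls by category and flag the static ones as requiring a model reload.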

Implementing a new model would take seconds

Even if it takes seconds, it still seems like an unnecessary process. Could I make a feature request to keep the -weights option available!?

Neosettler commented 2 years ago

So I'd really recommend configuring models.yaml. It takes a few seconds and allows for the easy switching you are looking for.

I have yet to understand why config files are necessary. One would think that config files could be generated by parsing the checkpoint directly. If there is no other way around this, consider keeping the official naming convention (naming all checkpoints model.ckpt has major drawbacks):

stable-diffusion-v1
    sd-v1-1.ckpt
    sd-v1-1-full-ema.ckpt
    sd-v1-2.ckpt
    sd-v1-2-full-ema.ckpt
    sd-v1-3.ckpt
    sd-v1-3-full-ema.ckpt
    sd-v1-4.ckpt
    sd-v1-4-full-ema.ckpt
    sd-v1-5.ckpt (soon)
    sd-v1-5-full-ema.ckpt (soon)

Maybe also add config files for every other known/compatible checkpoint that exists to date, if any?

Neosettler commented 2 years ago

If someone could explain how the config file's data are populated, I might just do it myself if that is an option.

blessedcoolant commented 2 years ago

If someone could explain how the config file's data are populated, I might just do it myself if that is an option.

It's pretty straightforward.

stable-diffusion-1.4: # name of the model you want to use 
    config:  configs/stable-diffusion/v1-inference.yaml # location of the config file for the model that is supplied with the code
    weights: models/ldm/stable-diffusion-v1/model.ckpt # location of the checkpoints for that model
    width: 512 # the width at which the model was trained
    height: 512 # the height at which the model was trained
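
With an entry like that in place, switching models is just a launch flag, something like:

python scripts/dream.py --model stable-diffusion-1.4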

Neosettler commented 2 years ago

Passing the path of the checkpoint seems much more elegant. I don't understand why the config files are needed. Is it possible to find an old version of the repo where the -weights option was implemented?

blessedcoolant commented 2 years ago

Passing the path of the checkpoint seems much more elegant. I don't understand why the config files are needed.

Because it is a lot cleaner. Set it up once and it's good to go.

Is it possible to find an old version of the repo where the -weights option was implemented?

Yes. You can find older releases on this repo. I wouldn't recommend using them though because so many bugs have been fixed since then and numerous features have been added.

This is a very simple and straightforward change that actually streamlines the process. I suggest you make your peace with it.

Neosettler commented 2 years ago

Bear in mind that I'm not trying to be combative here. I'm willing to integrate all the models, but it's still unclear how to populate the config files.

The current SD model config has been put into a file named "v1-inference.yaml"; should the configs for all the models below be put in there?

sd-v1-1.ckpt
sd-v1-1-full-ema.ckpt
sd-v1-2.ckpt
sd-v1-2-full-ema.ckpt
sd-v1-3.ckpt
sd-v1-3-full-ema.ckpt
sd-v1-4.ckpt
sd-v1-4-full-ema.ckpt
sd-v1-5.ckpt (soon)
sd-v1-5-full-ema.ckpt (soon)

You can setup your model once in the config/models.yaml and then use the -model argument to switch between them.

How would you go about attaching the configs to the checkpoints?

v1-inference.yaml:

model:
  base_learning_rate: 1.0e-04
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    personalization_config:
      target: ldm.modules.embedding_manager.EmbeddingManager
      params:
        placeholder_strings: ["*"]
        initializer_words: ["sculpture"]
        per_image_tokens: false
        num_vectors_per_token: 1
        progressive_words: False

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder

How are those values populated? Are they relevant to generating images?

maddavid123 commented 2 years ago

I believe that most (all?) versions of Stable Diffusion use the same inference config. Other diffusion models may use a different one. For example, Latent Diffusion uses a different config, as you can see in models.yaml.

For the versions of Stable Diffusion you want to make entries for in models.yaml, sd-v1-1 through sd-v1-5, the only thing that needs to change is the weights path. That does lend credibility to the desired use case for --weights.

For example, I recently downloaded waifu-diffusion, (A stable-diffusion model further trained on the Danbooru2021 dataset). Getting it to work with this repo was as simple as adding the following to models.yaml:

waifu-diffusion-1.2: # name of the model you want to use 
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/waifu-diffusion-v1/WD12.ckpt 
    width: 512 # the width at which the model was trained
    height: 512 # the height at which the model was trained

and then on the command line: python scripts/dream.py --model waifu-diffusion-1.2

Admittedly, the reason it was so easy is that waifu-diffusion is just an advancement on stable-diffusion 1.4. Keep in mind, however, that there is no guarantee the config will be the same for other diffusion models.

Neosettler commented 2 years ago

Easy then:

laion400m:
    config:  configs/latent-diffusion/txt2img-1p4B-eval.yaml
    weights: models/ldm/text2img-large/model.ckpt
    width: 256
    height: 256
stable-diffusion-1.1:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-1.ckpt
    width: 512
    height: 512
stable-diffusion-1.1-full-ema:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-1-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.2:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-2.ckpt
    width: 512
    height: 512
stable-diffusion-1.2-full-ema:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-2-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.3:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-3.ckpt
    width: 512
    height: 512
stable-diffusion-1.3-full-ema:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-3-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.4:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
    width: 512
    height: 512
stable-diffusion-1.4-full-ema:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-4-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.5:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-5.ckpt
    width: 512
    height: 512
stable-diffusion-1.5-full-ema:
    config:  configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-5-full-ema.ckpt
    width: 512
    height: 512

Thank you mad, very good input. I'm still curious why those config files are needed. I guess I should rephrase the question: are the config values necessary to generate images and make the current system work, or are they informative only?

maddavid123 commented 2 years ago

The configs themselves are an important part of the system. As far as I can tell, parts of the model get launched from within the config.

Take a look: there's a target, which maps to a class in the code base, and a params block whose values are passed to that target when it is loaded.

target: ldm.models.diffusion.ddpm.LatentDiffusion
params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False
    ...

For example, the target above loads ldm/models/diffusion/ddpm.py, and more specifically the LatentDiffusion class inside it. You can see that the values in the params section are consumed there. (I think I also see some of those values being used in the DDPM class, too.)

So it definitely looks important!
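
If it helps answer the "how are those values populated" part: the loading mechanism is roughly the following (a from-memory paraphrase of the instantiate_from_config helper in ldm/util.py, so treat the details as approximate).

import importlib

def instantiate_from_config(config):
    # Resolve the dotted "target" string to a class, then construct it with
    # the keyword arguments given in the "params" block (empty if absent).
    module_name, class_name = config["target"].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**config.get("params", dict()))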

Neosettler commented 2 years ago

Marvelous, thank you mad. It does raise the question of why those config files were not necessary while using the -weights option. Maybe the values were hard-coded to handle only a specific type of model? Plus, it seems there are not many compatible models yet to justify that config mechanic. In any case, I'm curious to know how those values were populated.

maddavid123 commented 2 years ago

If you look at the original CompVis repo, you can see that there's an opt-in for --laion400m. That was the only way to switch away from the default model.ckpt and config.

--weights, introduced in https://github.com/lstein/stable-diffusion/commit/ed513397b255868a9c0afe6dd7e580005b5d32bb, was mutually exclusive with --laion400m. If you used --laion400m, you wouldn't use --weights, as --laion400m overrode it. --laion400m forcefully specified the model ckpt path and config for latent diffusion.

if opt.laion400m:
        # defaults suitable to the older latent diffusion weights
        width = 256
        height = 256
        config = 'configs/latent-diffusion/txt2img-1p4B-eval.yaml'
        weights = 'models/ldm/text2img-large/model.ckpt'

If, on the other hand, you used --weights (or let it use the default), it would use the same Stable Diffusion config (v1-inference.yaml) regardless of which weights file you passed in, which we know does indeed work for existing Stable Diffusion models.

else:
        # some defaults suitable for stable diffusion weights
        width = 512
        height = 512
        config = 'configs/stable-diffusion/v1-inference.yaml'
        if '.ckpt' in opt.weights:
            weights = opt.weights
        else:
            weights = f'models/ldm/stable-diffusion-v1/{opt.weights}.ckpt'

--model, introduced in https://github.com/lstein/stable-diffusion/commit/d319b8a76238a79177c14621a1ff4ec9be5100be, merges the logic. With --model, you can specify exactly which model (config, weights, and training sizes) you wish to launch with.

models = OmegaConf.load('configs/models.yaml')
width = models[opt.model].width
height = models[opt.model].height
config = models[opt.model].config
weights = models[opt.model].weights

Hopefully that answers your question?

On a more personal note, I do prefer --model to --weights. A couple of examples why:

python scripts/dream.py --model SD14 can do everything that python scripts/dream.py --weights models/ldm/stable-diffusion-v1/model.ckpt did, and it's easier to remember. All that's necessary for that syntax would be adjusting the name in models.yaml to be SD14 instead of stable-diffusion-1.4.

(On a technicality, the first syntax was possible with --weights SD14, but it did require the weights to be at models/ldm/stable-diffusion-v1/SD14.ckpt; with --model that restriction isn't present.)

If I wish to tweak some values in the inference.yaml to see how they affect the output, I can create an entry in models.yaml for my experiments. Then, when I'm done, I can just switch which --model I'm using and return to vanilla Stable Diffusion. None of the core code for actually running the model needs to change.
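
For instance, a hypothetical experimental entry (the names below are made up) could sit right next to the stock one:

stable-diffusion-1.4-experiment:
    config:  configs/stable-diffusion/v1-inference-experiment.yaml  # tweaked copy of v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/model.ckpt
    width: 512
    height: 512

Switching back to vanilla is then just a matter of launching with --model stable-diffusion-1.4 again.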

Neosettler commented 2 years ago

I mean, old models are still relevant for several reasons, and switching models on the fly is a must-have for the kind of experiments I'm after. It does answer my question, thank you mad!