> -sampler is set during initialization, is that necessary since it can be set in the main loop?
Yes, because this repo allows you to change the sampler on the fly for each prompt without needing to restart the entire application.
The init argument is for boot and the prompt argument is for prompts. If you don't set a prompt argument, it defaults to your init argument.
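Concretely, that looks something like this (the flag names here are illustrative and may differ between versions, so check --help):

# pick a default sampler at boot
python scripts/dream.py --sampler k_lms

# then override it for a single prompt inside the interactive loop
dream> "an astronaut riding a horse" -A k_euler_a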
> -full_precision is set only at initialization; there is none for the main loop. If changed after initialization, does this mean we must reload the model for the precision to take effect?
Yes.
> -n_sample: I had a hard time figuring out how "n_sample" and "n_iter" differed before. It seems only "iterations" has been kept now; could anyone clear this up? Why is n_sample not needed?
Because most systems were completely incapable of creating multiple samples in the same batch. Iterations perform the same job while still providing accurate seeds for each generated image, and they actually run across systems. And because this repo runs per prompt, you don't really ever need batches. On top of that, batch size was creating programming difficulties, so we decided to get rid of it.
You can just use iterations and you will be perfectly fine.
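For example, assuming the -n/--iterations prompt flag (check your version's help text), the following would produce three images in sequence, each logged with its own seed:

dream> "a red barn in the snow" -n3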
> -model: we used to be able to pass the model file, which was very handy for swapping models quickly. Now there is a bunch of hard-coded stuff to be cooked, like the names of the model files and some config madness. What happened here?
This is a lot more organized and streamlined now. You can set up your model once in configs/models.yaml and then use the --model argument to switch between them. If you don't want to set them up, you can still use --weights to override everything and load like you did before.
Thank you for your support, blessedcoolant.
Mind me asking for a use case example with --weights?
Note: it seems like it has been deprecated the hard way.
> Note: it seems like it has been deprecated the hard way.
Yes, it has been. My bad; so much to keep track of.
So I'd really recommend configuring models.yaml. It takes a few seconds and allows for the easy switching you are looking for.
1.5 is coming soon, and it seems like adding a new model is not trivial... will it set us back, waiting for an implementation from this repo?
> 1.5 is coming soon, and it seems like adding a new model is not trivial... will it set us back, waiting for an implementation from this repo?
Implementing a new model would take seconds -- literally -- as long as there are no architectural changes, which 1.5 most likely will not have. So all you really have to wait for is someone to add those few lines of code and push the button.
But even better, you don't need to wait for anyone. You can just copy and paste the existing few lines in models.yaml, change 1.4 to 1.5, and you'll be good to go yourself. Actually, I think even replacing the older model.ckpt with the new one will do the trick, without any other required changes.
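For that drop-in route, the swap would look something like this (assuming the default directory layout; back up the old weights first):

# keep the old weights around, then install the new checkpoint under the old name
mv models/ldm/stable-diffusion-v1/model.ckpt models/ldm/stable-diffusion-v1/model.ckpt.orig
cp sd-v1-5.ckpt models/ldm/stable-diffusion-v1/model.ckpt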
What you really need help with is when a massive upgrade happens. At that point, there's a TON of active devs on this repo every day upgrading things ASAP.
Exactly as @blessedcoolant says. Upgrading to the 1.5 weights should be a snap. We'll post step-by-step instructions here.
Thank you so much for your input.
I'm in the midst of connecting every single parameter to UI components. For clarity's sake, my suggestion would be to add an argument category='checkpoint'|'GFPGAN'|etc.:
parser.add_argument(
    '-F',
    '--full_precision',
    category='checkpoint',
    dest='full_precision',
    action='store_true',
    help='Use more memory-intensive full precision math for calculations',
)
Furthermore, add a "static" or "dynamic" visibility flag to indicate that modifying a parameter is only taken into account after reloading the entire model. That is VERY IMPORTANT to know!
parser.add_argument(
    '-F',
    '--full_precision',
    category='checkpoint',
    static='yes',
    dest='full_precision',
    action='store_true',
    help='Use more memory-intensive full precision math for calculations',
)
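argparse doesn't accept extra keyword arguments like category or static, so one way to prototype the idea today is a thin wrapper that stores the metadata in a side registry. This is just a sketch: add_argument_with_meta and ARG_METADATA are hypothetical names, and category/static are the proposed fields from above:

import argparse

# Side registry mapping each argument's dest to its proposed UI metadata.
ARG_METADATA = {}

def add_argument_with_meta(parser, *args, category=None, static=False, **kwargs):
    # Pass everything argparse understands through; keep the extras aside.
    action = parser.add_argument(*args, **kwargs)
    ARG_METADATA[action.dest] = {'category': category, 'static': static}
    return action

parser = argparse.ArgumentParser()
add_argument_with_meta(
    parser,
    '-F',
    '--full_precision',
    category='checkpoint',  # hypothetical grouping field suggested above
    static=True,            # hypothetical: True = takes effect only after a model reload
    dest='full_precision',
    action='store_true',
    help='Use more memory-intensive full precision math for calculations',
)

# A UI layer could then group widgets by category and warn on static parameters:
for dest, meta in ARG_METADATA.items():
    print(dest, meta)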
> Implementing a new model would take seconds
Even if it takes seconds, it still seems like an unnecessary process. Could I make a feature request to keep the --weights option available?!
> So I'd really recommend configuring models.yaml. It takes a few seconds and allows for the easy switching you are looking for.
I have yet to understand why config files are necessary. One would think that config files could be generated by parsing the checkpoint directly. If there is no other way around this, consider keeping the official naming convention (naming all checkpoints model.ckpt has major drawbacks):
stable-diffusion-v1/
    sd-v1-1.ckpt
    sd-v1-1-full-ema.ckpt
    sd-v1-2.ckpt
    sd-v1-2-full-ema.ckpt
    sd-v1-3.ckpt
    sd-v1-3-full-ema.ckpt
    sd-v1-4.ckpt
    sd-v1-4-full-ema.ckpt
    sd-v1-5.ckpt (soon)
    sd-v1-5-full-ema.ckpt (soon)
Maybe add config files for every other known/compatible checkpoint that exists to date, if any?
If someone could explain how the config file's data are populated, I might just do it, if that is an option.
> If someone could explain how the config file's data are populated, I might just do it, if that is an option.
It's pretty straightforward.
stable-diffusion-1.4:                                   # name of the model you want to use
    config: configs/stable-diffusion/v1-inference.yaml # location of the config file for the model that is supplied with the code
    weights: models/ldm/stable-diffusion-v1/model.ckpt # location of the checkpoints for that model
    width: 512                                          # the width at which the model was trained
    height: 512                                         # the height at which the model was trained
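As a quick sanity check, you can verify that an entry resolves to real files before launching. This is just a sketch; OmegaConf is the library the repo already uses to read this file:

import os
from omegaconf import OmegaConf

models = OmegaConf.load('configs/models.yaml')
entry = models['stable-diffusion-1.4']  # the entry name from the snippet above
for key in ('config', 'weights'):
    print(key, entry[key], 'exists:', os.path.exists(entry[key]))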
Passing the path of the checkpoint seems much more elegant. I don't understand why the config files are needed. Is it possible to find an old version of the repo where the --weights option was implemented?
> Passing the path of the checkpoint seems much more elegant. I don't understand why the config files are needed.
Because it is a lot cleaner. Set it up once and it's good to go.
> Is it possible to find an old version of the repo where the --weights option was implemented?
Yes, you can find older releases on this repo. I wouldn't recommend using them, though, because so many bugs have been fixed since then and numerous features have been added.
This is a very simple and straightforward change that actually streamlines the process. I suggest you make your peace with it.
Bear in mind that I'm not trying to be combative here. I'm willing to integrate all the models, but it's still unclear how to populate the config files.
The current SD model config lives in a file named v1-inference.yaml; should all the configs for the models below point there?
sd-v1-1.ckpt
sd-v1-1-full-ema.ckpt
sd-v1-2.ckpt
sd-v1-2-full-ema.ckpt
sd-v1-3.ckpt
sd-v1-3-full-ema.ckpt
sd-v1-4.ckpt
sd-v1-4-full-ema.ckpt
sd-v1-5.ckpt (soon)
sd-v1-5-full-ema.ckpt (soon)
> You can set up your model once in configs/models.yaml and then use the --model argument to switch between them.
How would you go about attaching the configs to the checkpoints?
v1-inference.yaml:
model:
  base_learning_rate: 1.0e-04
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    scheduler_config: # 10000 warmup steps
      target: ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 10000 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    personalization_config:
      target: ldm.modules.embedding_manager.EmbeddingManager
      params:
        placeholder_strings: ["*"]
        initializer_words: ["sculpture"]
        per_image_tokens: false
        num_vectors_per_token: 1
        progressive_words: False

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
How are those values populated? Are they relevant to generating images?
I believe that most (all?) versions of Stable Diffusion use the same inference config. Other diffusion models may use a different config; for example, Latent Diffusion uses a different one, as you can see in models.yaml.
For the versions of Stable Diffusion you want to make entries for in models.yaml, sd-v1-1 through sd-v1-5, the only thing that does need to change is the model checkpoint. That does lend credibility to the desired use case of using --weights.
For example, I recently downloaded waifu-diffusion (a stable-diffusion model further trained on the Danbooru2021 dataset). Getting it to work with this repo was as simple as adding the following to models.yaml:
waifu-diffusion-1.2:                                    # name of the model you want to use
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/waifu-diffusion-v1/WD12.ckpt
    width: 512                                          # the width at which the model was trained
    height: 512                                         # the height at which the model was trained
and then on the command line:
python scripts/dream.py --model waifu-diffusion-1.2
Admittedly, the reason it was so easy is that waifu-diffusion is just an advancement of stable-diffusion 1.4. Keep in mind, however, that there is no guarantee the config will be the same for other diffusion models.
Easy then:
laion400m:
    config: configs/latent-diffusion/txt2img-1p4B-eval.yaml
    weights: models/ldm/text2img-large/model.ckpt
    width: 256
    height: 256
stable-diffusion-1.1:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-1.ckpt
    width: 512
    height: 512
stable-diffusion-1.1-full-ema:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-1-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.2:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-2.ckpt
    width: 512
    height: 512
stable-diffusion-1.2-full-ema:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-2-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.3:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-3.ckpt
    width: 512
    height: 512
stable-diffusion-1.3-full-ema:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-3-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.4:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
    width: 512
    height: 512
stable-diffusion-1.4-full-ema:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-4-full-ema.ckpt
    width: 512
    height: 512
stable-diffusion-1.5:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-5.ckpt
    width: 512
    height: 512
stable-diffusion-1.5-full-ema:
    config: configs/stable-diffusion/v1-inference.yaml
    weights: models/ldm/stable-diffusion-v1/sd-v1-5-full-ema.ckpt
    width: 512
    height: 512
Thank you mad, very good input. I'm still curious why those config files are needed. I guess I should rephrase the question: are the config values necessary to generate images and make the current system work, or are they informative only?
The configs themselves are an important part of the system. As far as I can tell, parts of the model get launched from within the config.
Take a look: there's a target, which is associated with a file in the source tree, and a params section; the target gets loaded with the params specified.
target: ldm.models.diffusion.ddpm.LatentDiffusion
params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "jpg"
    cond_stage_key: "txt"
    image_size: 64
    channels: 4
    cond_stage_trainable: false # Note: different from the one we trained before
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False
    ...
This, for example, loads ldm/models/diffusion/ddpm.py and, more specifically, the class LatentDiffusion. You can see that the values in the params section are consumed there. (I think I also see some of those values being used in the DDPM class, too.)
So it definitely looks important!
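Mechanically, the CompVis code does this via ldm.util.instantiate_from_config, which boils down to roughly the following (a simplified sketch, not the exact implementation):

import importlib

def instantiate_from_config(config):
    # 'target' is a dotted import path; 'params' become constructor kwargs.
    module_path, cls_name = config['target'].rsplit('.', 1)
    cls = getattr(importlib.import_module(module_path), cls_name)
    return cls(**config.get('params', {}))

So instantiate_from_config(yaml_dict['model']) would build the LatentDiffusion object with linear_start, timesteps, and so on passed straight into its constructor.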
Marvelous, thank you mad. It does raise the question of why those config files were not necessary while using the --weights option. Maybe the values were hard-coded to handle only a specific type of model? Plus, it seems that there are not many compatible models yet to justify that config mechanic. In any case, I'm curious to know how those values were populated.
If you look at the original CompVis repo, you can see that there's an opt-in --laion400m flag. This was the only way to switch off of using the default model.ckpt and config.
--weights, introduced in https://github.com/lstein/stable-diffusion/commit/ed513397b255868a9c0afe6dd7e580005b5d32bb, was mutually exclusive with --laion400m.
If you used --laion400m, you wouldn't use --weights, as --laion400m overrode it. --laion400m forcefully specified the model ckpt path and config for latent diffusion.
if opt.laion400m:
    # defaults suitable to the older latent diffusion weights
    width = 256
    height = 256
    config = 'configs/latent-diffusion/txt2img-1p4B-eval.yaml'
    weights = 'models/ldm/text2img-large/model.ckpt'
If, otherwise, you used --weights (or let it use the default), it would use the same stable diffusion config (v1-inference.yaml) regardless of which weights file you input, which we know does indeed work for existing stable diffusion models.
else:
    # some defaults suitable for stable diffusion weights
    width = 512
    height = 512
    config = 'configs/stable-diffusion/v1-inference.yaml'
    if '.ckpt' in opt.weights:
        weights = opt.weights
    else:
        weights = f'models/ldm/stable-diffusion-v1/{opt.weights}.ckpt'
--model, introduced in https://github.com/lstein/stable-diffusion/commit/d319b8a76238a79177c14621a1ff4ec9be5100be, merges the logic. With --model, you can specify exactly which model (config, weights, and training sizes) you wish to launch with.
models = OmegaConf.load('configs/models.yaml')
width = models[opt.model].width
height = models[opt.model].height
config = models[opt.model].config
weights = models[opt.model].weights
Hopefully that answers your question?
On a more personal note, I do prefer --model to --weights. A couple of examples why:

python scripts/dream.py --model SD14

can do everything that

python scripts/dream.py --weights models/ldm/stable-diffusion-v1/model.ckpt

did, and it's easier to remember. All that's necessary for that syntax would be adjusting the name in models.yaml to be SD14 instead of stable-diffusion-1.4. (On a technicality, the first syntax was possible with --weights SD14, but it did require the weights to be at models/ldm/stable-diffusion-v1/SD14.ckpt; with --model that restriction isn't present.)
If I wish to tweak some values in v1-inference.yaml to see how they affect the output, I can create an entry in models.yaml for my experiments. Then, when I'm done, I can just switch which --model I'm using and return to vanilla Stable Diffusion. None of the core code for actually running the model needs to change.
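Such an experiment entry might look like this (all the names here are made up for illustration):

sd14-tweaked:                                                    # hypothetical experiment entry
    config: configs/stable-diffusion/v1-inference-tweaked.yaml  # an edited copy of the stock config
    weights: models/ldm/stable-diffusion-v1/sd-v1-4.ckpt
    width: 512
    height: 512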
I mean, old models are still relevant for several reasons, and switching models on the fly is a must-have for the kind of experiments I'm after. It does answer my question, thank you mad!
Greetings,
I'm in the process of migrating from the vanilla SD repo to this one. I have a few questions regarding the arguments/options:
Some arguments are duplicated, once at initialization and once for the main loop. That's confusing.
-sampler is set during initialization, is that necessary since it can be set in the main loop?
-full_precision is set only at initialization; there is none for the main loop. If changed after initialization, does this mean we must reload the model for the precision to take effect?
-n_sample: I had a hard time figuring out how "n_sample" and "n_iter" differed before. It seems only "iterations" has been kept now; could anyone clear this up? Why is n_sample not needed?
-model: we used to be able to pass the model file, which was very handy for swapping models quickly. Now there is a bunch of hard-coded stuff to be cooked, like the names of the model files and some config madness. What happened here?
Any insight would be appreciated. Thank you,