rinongal / textual_inversion

MIT License
2.87k stars 278 forks

Quality of inference output images is worse than that of "samples_scaled" images in log with stable diffusion #144

Open KaiyueSun98 opened 1 year ago

KaiyueSun98 commented 1 year ago

Hi,

I use configs/stable-diffusion/v1-finetune.yaml for inversion and configs/stable-diffusion/v1-inference.yaml for inference (by replacing the original config in scripts/txt2img.py with configs/stable-diffusion/v1-inference.yaml). Other parameters remain unchanged.

when I use stable diffusion 1.4 to generate images (with the same SD model doing the inversion),

python scripts/txt2img.py --ddim_eta 0.0 \
                          --n_samples 8 \
                          --n_iter 2 \
                          --scale 10.0 \
                          --ddim_steps 50 \
                          --embedding_path /path/to/logs/trained_model/checkpoints/embeddings_gs-5049.pt \
                          --ckpt_path /path/to/pretrained/model.ckpt \
                          --prompt "a photo of *"

the output images look worse than the samples_scaled images in the log. Is the way the samples_scaled images are generated consistent with the way the output images are generated? Or are there any other inference parameters I need to change to fit the SD model?

rinongal commented 1 year ago

The samples_scaled images use a guidance scale of 5.0, and you are using 10.0. If the poor quality you are seeing is oversaturation or similar artifacts, this is likely the reason. Try dropping the guidance scale a bit (stable diffusion typically uses 7.5).
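For example, the original invocation with only the scale lowered to 7.5 would look like this (paths are the placeholders from the command above, not real files):

```shell
python scripts/txt2img.py --ddim_eta 0.0 \
                          --n_samples 8 \
                          --n_iter 2 \
                          --scale 7.5 \
                          --ddim_steps 50 \
                          --embedding_path /path/to/logs/trained_model/checkpoints/embeddings_gs-5049.pt \
                          --ckpt_path /path/to/pretrained/model.ckpt \
                          --prompt "a photo of *"
```

Lowering `--scale` trades prompt adherence for image fidelity, so values between 5.0 and 7.5 are worth sweeping.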

wonjunior commented 1 year ago

@KaiyueSun98 Which weight file did you provide to --ckpt_path to load stable diffusion?

KaiyueSun98 commented 1 year ago

Hi @wonjunior, I downloaded the weight file "sd-v1-4.ckpt" from Hugging Face: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original

kaneyxx commented 1 year ago

@KaiyueSun98 Hello, I downloaded the weights as you mentioned and used configs/stable-diffusion/v1-finetune.yaml for training. There is a warning like Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: followed by a long list of layer names. Did you have the same situation?

XiaominLi1997 commented 1 year ago

> @KaiyueSun98 Hello, I've downloaded the weight as what you mentioned and used the configs/stable-diffusion/v1-finetune.yaml for training. There is an error code like Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: and followed by a lot of layer name. Did you have the same situation?

Hi, did you resolve this issue? I could use some help. Thanks.

kaneyxx commented 1 year ago

> @KaiyueSun98 Hello, I've downloaded the weight as what you mentioned and used the configs/stable-diffusion/v1-finetune.yaml for training. There is an error code like Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: and followed by a lot of layer name. Did you have the same situation?
>
> Hi, did you resolve this issue? Maybe I need some help. Thanks.

I found that this warning doesn't affect the pipeline, so I just ignored it lol.
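For context, that message is a warning rather than an error: the openai/clip-vit-large-patch14 checkpoint contains both CLIP's vision and text towers, while CLIPTextModel only loads the text tower, so the vision weights are reported as unused. If the noise is bothersome, it can be silenced with the transformers logging utilities (a minimal sketch, assuming the transformers library is installed):

```python
from transformers import logging as hf_logging

# The "Some weights of the model checkpoint ... were not used when initializing
# CLIPTextModel" warning is emitted at WARNING level when from_pretrained()
# discards the vision-tower weights. Raising the verbosity threshold to ERROR
# before loading the model hides it without affecting training.
hf_logging.set_verbosity_error()
```

Call this before `CLIPTextModel.from_pretrained(...)` runs (i.e. before the training script instantiates the text encoder); the discarded vision weights are never needed for textual inversion.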

XiaominLi1997 commented 1 year ago

Thanks : )