basujindal / stable-diffusion

Optimized Stable Diffusion modified to run on lower GPU VRAM

Prompt weights are ignored if they're immediately followed by a comma #173

Open xcorvis opened 1 year ago

xcorvis commented 1 year ago

When a prompt weight is immediately followed by a comma, the weight fails to parse. Bad: "portrait, Jason Momoa:0.1, blah" Good: "portrait, Jason Momoa:0.1 , blah"

Expected behavior: the weight is read correctly even when it is immediately followed by the separator character.

The prompt is still accepted and the weight falls back to 1, so the job finishes, but a warning is printed: "Warning: '0.1,' is not a value, are you missing a space?". Aside: the way the warning is presented should also be fixed. It is jammed into the middle of the Sampling progress bar, and should probably be printed before the Sampling output starts.
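To make the expected behavior concrete, here is a minimal sketch of a more tolerant weight parse. It is not the repository's actual code: `parse_weight`, its `default` parameter, and the choice to strip trailing `,` and `;` are all assumptions made for illustration.

```python
def parse_weight(token: str, default: float = 1.0) -> float:
    """Parse the text found after ':' as a weight (hypothetical helper).

    Trailing separator characters such as ',' are stripped first, so
    '0.1,' is read as 0.1 instead of falling back to the default.
    """
    cleaned = token.strip().rstrip(",;")
    try:
        return float(cleaned)
    except ValueError:
        # Same fallback the issue describes: warn and use a weight of 1.
        print(f"Warning: '{token}' is not a value, using {default}")
        return default


print(parse_weight("0.1,"))  # 0.1
print(parse_weight("0.1"))   # 0.1
```

With something like this, both the "bad" and the "good" prompt would yield the same weight, and the warning would only appear for tokens that are not numbers at all.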

Example of bad in action:

python optimizedSD/optimized_txt2img.py --W 512 --H 512 --skip_grid --n_iter 1 --n_samples 5 --prompt "portrait, Jason Momoa:0.1, highly detailed, trending on art station, picture of the day" --ddim_steps 50 --scale 7.5 --seed 987388 --outdir ../output/2022-09-13-demo
Global seed set to 987388
Loading model from models/ldm/stable-diffusion-v1/model.ckpt
Global Step: 470000
UNet: Running in eps-prediction mode
CondStage: Running in eps-prediction mode
FirstStage: Running in eps-prediction mode
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Sampling:   0%|                                                                                                                      | 0/1 [00:00<?, ?it/sWarning: '0.1,' is not a value, are you missing a space?                                                                              | 0/1 [00:00<?, ?it/s]
seeds used =  [987388, 987389, 987390, 987391, 987392]
Data shape for PLMS sampling is [5, 4, 64, 64]
Running PLMS Sampling with 50 timesteps
PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:11<00:00,  1.43s/it]torch.Size([5, 4, 64, 64])
saving images 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:11<00:00,  1.37s/it]
memory_final =  5.382656
data: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [01:16<00:00, 76.65s/it]
Sampling: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [01:16<00:00, 76.65s/it]
Samples finished in 1.45 minutes and exported to ../output/2022-09-13-demo/portrait,_Jason_Momoa_0.1,_highly_detailed,_trending_on_art_station,_picture_of_the_day
 Seeds used = 987388,987389,987390,987391,987392

Example of good (no error):


python optimizedSD/optimized_txt2img.py --W 512 --H 512 --skip_grid --n_iter 1 --n_samples 5 --prompt "portrait, Jason Momoa:0.1 , highly detailed, trending on art station, picture of the day" --ddim_steps 50 --scale 7.5 --seed 987388 --outdir ../output/2022-09-13-demo
Global seed set to 987388
Loading model from models/ldm/stable-diffusion-v1/model.ckpt
Global Step: 470000
UNet: Running in eps-prediction mode
CondStage: Running in eps-prediction mode
FirstStage: Running in eps-prediction mode
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Sampling:   0%|                                                                                                                      | 0/1 [00:00<?, ?it/sseeds used =  [987388, 987389, 987390, 987391, 987392]                                                                                | 0/1 [00:00<?, ?it/s]
Data shape for PLMS sampling is [5, 4, 64, 64]
Running PLMS Sampling with 50 timesteps
PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:11<00:00,  1.43s/it]torch.Size([5, 4, 64, 64])
saving images 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [01:11<00:00,  1.38s/it]
memory_final =  5.382656
data: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [01:16<00:00, 76.29s/it]
Sampling: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [01:16<00:00, 76.29s/it]
Samples finished in 1.45 minutes and exported to ../output/2022-09-13-demo/portrait,_Jason_Momoa_0.1_,_highly_detailed,_trending_on_art_station,_picture_of_the_day
 Seeds used = 987388,987389,987390,987391,987392

tazzkiller commented 1 year ago

You don't need a separator after the weight value; the weight itself acts as a separator.
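A rough illustration of what "the weight acts as a separator" would mean in practice, assuming each `:number` closes the sub-prompt before it and any trailing text gets a weight of 1. The function name `split_on_weights` and the regex are hypothetical and are not taken from the repository.

```python
import re

# Assumption: a colon followed by a number ends the current sub-prompt.
WEIGHT = re.compile(r":\s*(-?\d+(?:\.\d+)?)")


def split_on_weights(prompt: str, default: float = 1.0):
    """Split a prompt into (sub-prompt, weight) pairs (illustrative only)."""
    parts = []
    pos = 0
    for match in WEIGHT.finditer(prompt):
        # Everything since the previous weight belongs to this sub-prompt.
        text = prompt[pos:match.start()].strip(" ,")
        if text:
            parts.append((text, float(match.group(1))))
        pos = match.end()
    # Text after the last weight has no explicit weight of its own.
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, default))
    return parts


print(split_on_weights("portrait, Jason Momoa:0.1 highly detailed"))
# [('portrait, Jason Momoa', 0.1), ('highly detailed', 1.0)]
```

Under these assumptions the weight 0.1 applies to everything before it ("portrait, Jason Momoa"), not just the last phrase, and a comma after the weight is redundant.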

xcorvis commented 1 year ago

Good to know, thanks. I was writing some prompt scripting and noticed this, and also noticed it with another tool that assembled prompts.