Closed: enzyme69 closed this issue 1 year ago.
My feeling is also that the processing time is reported incorrectly. The GPU and CPU seem maxed out, but generation is still slow. If I use Diffusion Bee it's faster, and I can still do other things while it runs.
Took me 20 minutes to generate the output:
Quality is pretty good, but it's still too slow. If I use Stable Diffusion WebUI, it takes 6-8 minutes.
Apple's CoreML is slow to load the model (Python vs. Swift mode), but I can get generation down to 20-40 seconds.
Generating 🖼 1/1: "a drawing of a girl sitting on a pillow with a flower in her hand and a piggy bank in front of her with a flower in her hand" 512x512px seed:335075681 prompt-strength:7.5 steps:15 sampler-type:k_dpmpp_2m
100%|████████████████████████████████████████████████████| 15/15 [19:35<00:00, 78.36s/it]
Downloading: 100%|███████████████████████████████████████| 342/342 [00:00<00:00, 166kB/s]
Downloading: 100%|██████████████████████████████████| 4.44k/4.44k [00:00<00:00, 1.01MB/s]
Downloading: 100%|██████████████████████████████████| 1.13G/1.13G [01:33<00:00, 13.0MB/s]
Image Generated. Timings: conditioning:1.12s sampling:1175.51s safety-filter:102.11s total:1282.94s
🖼 [generated] saved to: ./outputs/generated/000006_335075681_kdpmpp2m15_PS7.5_a_drawing_of_a_girl_sitting_on_a_pillow_with_a_flower_in_her_hand_and_a_piggy_bank_in_front_of_her_with_a_flower_in_her_hand_[generated].jpg
May not be much help, but here are some observations...
Using 7.0.0 on an M1 16GB gives me Python memory-use readings as high as 20GB (that's 8GB out on swap!), so I think running on 8GB of physical memory is always going to be ambitious for the SD-2.0 codebase. (A snippet for checking swap usage follows this comment.)
Pretty sure DiffusionBee is on the SD-1.5 codebase.
If you install Imaginairy v5.1.0 (which I think is also on the SD-1.5 codebase) you should get generation times similar to DiffusionBee, in my experience.
Hope this helps
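If anyone wants to double-check those memory numbers on their own machine, macOS exposes swap usage directly; `sysctl vm.swapusage` and `top -l 1` are stock commands, and the `grep` filter is just one way to pull out the memory line:

```
# report macOS swap usage (total / used / free)
sysctl vm.swapusage

# one-shot snapshot of physical memory use while a generation runs
top -l 1 | grep PhysMem
```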
I think Diffusion Bee uses a different architecture entirely. It may run a lot more efficiently.
@enzyme69 To make sure I'm understanding correctly, you're saying it takes 20 minutes with imaginairy, 7 minutes with automatic webui, and 30 seconds with CoreML. I didn't realize the CoreML model was out and working. I would like to integrate that but realistically not sure when I'll find the time.
@Cybergate9 The 2.0 model shouldn't be taking more memory... I think. I consider it a bug if performance is worse in 7.0 than in 5.1. The 2.0v model, however, does require more memory and does run slower.
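For reference, a minimal sketch of selecting each checkpoint from the CLI. The `SD-2.0` short name matches the `--model` value used later in this thread; `SD-2.0-v` is my assumption for the v-model's name, so verify it against `imagine --help`:

```
# base 2.0 checkpoint (512px), as used later in this thread
imagine --model SD-2.0 "a giant smiling face stone at a greenforest"

# 2.0 v-prediction checkpoint: needs more memory and runs slower
# (the "SD-2.0-v" short name is an assumption; check `imagine --help`)
imagine --model SD-2.0-v "a giant smiling face stone at a greenforest"
```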
Thanks for the explanation.
(base) blendersushi@192-168-1-102 ~ % imagine --model SD-2.0 "a giant smiling face stone at a greenforest"
🤖🧠 imaginAIry received 1 prompt(s) and will repeat them 1 times to create 1 images.
Generating 🖼 1/1: "a giant smiling face stone at a greenforest" 512x512px seed:878376886 prompt-strength:7.5 steps:15 sampler-type:k_dpmpp_2m
Loading model /Users/blendersushi/.cache/huggingface/transformers/24bd254e54b30e83bcbc15efae29f0ef55256fd144823a9437f5956e594f6803.dcd6f0dd97c55495efb8393e64f704f3f398d695965876be0339bf96b93e2b4e onto mps:0 backend...
Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 3.94G/3.94G [05:06<00:00, 12.9MB/s]
100%|████████████████████████████████████████████████████| 15/15 [08:12<00:00, 32.83s/it]
Image Generated. Timings: conditioning:3.98s sampling:492.81s safety-filter:6.42s total:506.08s
🖼 [generated] saved to: ./outputs/generated/000007_878376886_kdpmpp2m15_PS7.5_a_giant_smiling_face_stone_at_a_greenforest_[generated].jpg
After a restart and closing all apps, I did get a faster result, around 8 minutes. This might be as good as it gets on 8 GB of RAM; I'd need 16-32 GB with an M1 Max or M2.
That's right
> @Cybergate9 The 2.0 model shouldn't be taking more memory... I think. I consider it a bug if performance is worse in 7.0 than in 5.1. The 2.0v model, however, does require more memory and does run slower.
Nope, the difference has always been there for me between 6.x-7.x and 5.x, i.e. same command line, same model:
imagine "a picture of an 18th century lady in style of decoupage" --model SD-1.5
Generating 🖼 1/1: "a picture of an 18th century lady in style of decoupage" 512x512px seed:907966062 prompt-strength:7.5 steps:15 sampler-type:k_dpmpp_2m
Loading model /Users/shaun/.cache/huggingface/transformers/4c1a32af58eeaff9f36410f7ca27e51a8856185c3f05d5b930975a1397914f10.98fc1312797017a8bac6993df565908fd18f09319b40d9bd35457dfa1459ecf0 onto mps:0 backend...
100%|███████████████████████████████████████████████| 15/15 [00:17<00:00, 1.19s/it]
Image Generated. Timings: conditioning:0.22s sampling:17.90s safety-filter:5.52s total:24.58s
Generating 🖼 1/1: "a picture of an 18th century lady in style of decoupage" 512x512px seed:779724400 prompt-strength:7.5 steps:15 sampler-type:k_dpmpp_2m
Loading model /Users/shaun/.cache/huggingface/transformers/4c1a32af58eeaff9f36410f7ca27e51a8856185c3f05d5b930975a1397914f10.98fc1312797017a8bac6993df565908fd18f09319b40d9bd35457dfa1459ecf0 onto mps:0 backend...
100%|███████████████████████████████████████████| 15/15 [01:50<00:00, 7.36s/it]
Image Generated. Timings: conditioning:0.30s sampling:110.45s safety-filter:6.60s total:119.26s
No idea why; I'd always assumed the new codebase released for SD-2 (incorporated since 6.0.0a) was the difference?
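One way to pin down the regression would be timing the identical fixed-seed render under each version, e.g. 5.1.0 and 7.0.0 installed in separate virtualenvs. A minimal sketch, assuming `--seed` and `--steps` CLI options matching the values printed in the logs above:

```
# in venv A
pip install imaginairy==5.1.0
# in venv B
pip install imaginairy==7.0.0

# then run the same command in each and compare wall times
time imagine --model SD-1.5 --seed 907966062 --steps 15 \
    "a picture of an 18th century lady in style of decoupage"
```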
> @enzyme69 To make sure I'm understanding correctly, you're saying it takes 20 minutes with imaginairy, 7 minutes with automatic webui, and 30 seconds with CoreML. I didn't realize the CoreML model was out and working. I would like to integrate that but realistically not sure when I'll find the time.
The initial CoreML release is at: https://github.com/apple/ml-stable-diffusion. My two cents' worth:
I think I need 15-20 minutes to run a simple prompt at 20 sampling steps.
I wonder if there should be an option to run using CoreML Stable Diffusion?
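For what it's worth, the Python entry points in that repo look roughly like this; flag spellings are taken from its README at the time, so double-check against the repo before relying on them:

```
# one-time conversion of the SD weights to Core ML .mlpackage files
python -m python_coreml_stable_diffusion.torch2coreml \
    --convert-unet --convert-text-encoder --convert-vae-decoder \
    -o ./coreml-models

# generation via the Core ML pipeline (--compute-unit ALL = CPU + GPU + Neural Engine)
python -m python_coreml_stable_diffusion.pipeline \
    --prompt "a giant smiling face stone at a greenforest" \
    -i ./coreml-models -o ./outputs --compute-unit ALL --seed 93
```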
Memory error also happened:
Loading model /Users/blendersushi/.cache/huggingface/transformers/4c1a32af58eeaff9f36410f7ca27e51a8856185c3f05d5b930975a1397914f10.98fc1312797017a8bac6993df565908fd18f09319b40d9bd35457dfa1459ecf0 onto mps:0 backend...
80%|█████████████████████████████████████████▌ | 12/15 [14:01<03:30, 70.10s/it]
RuntimeError: Not enough memory, use lower resolution (max approx. 448x448). Need: 0.0GB free, Have: 0.0GB free
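Since the error itself suggests roughly 448x448 as the ceiling, one workaround is to retry at that size. A sketch assuming imaginAIry exposes `--width`/`--height` options (check `imagine --help` for the exact names):

```
# retry at the resolution the error message says still fits in memory
imagine --model SD-1.5 --width 448 --height 448 "<your prompt>"
```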