pharmapsychotic / clip-interrogator

Image to prompt with BLIP and CLIP
MIT License

Loading flavors.txt is very slow #99

Closed pmeems closed 1 year ago

pmeems commented 1 year ago

I run on

I use

I run run_cli.py as: python run_cli.py -c "ViT-H-14/laion2b_s32b_b79k" -f "d:\MyFolder"

Loading flavors.txt is taking very long. It has now been running for almost 2 hours and is only at 16%:

python run_cli.py -c "ViT-H-14/laion2b_s32b_b79k" -f "D:\MyFolder"
CUDA is available and will be used.
CUDO version: 12.1
Loading caption model blip-large...
Loading CLIP model ViT-H-14/laion2b_s32b_b79k...
ViT-H-14_laion2b_s32b_b79k_artists.safetensors: 100%|█████████████████████████████| 21.6M/21.6M [00:01<00:00, 16.0MB/s]
Preprocessing artists:   0%|                                                                     | 0/1 [00:00<?, ?it/s]
  attn_output = scaled_dot_product_attention(q, k, v, attn_mask, dropout_p, is_causal)
Preprocessing artists: 100%|█████████████████████████████████████████████████████████████| 1/1 [00:10<00:00, 10.64s/it]
ViT-H-14_laion2b_s32b_b79k_flavors.safetensors: 100%|███████████████████████████████| 207M/207M [00:10<00:00, 20.1MB/s]
ViT-H-14_laion2b_s32b_b79k_mediums.safetensors: 100%|███████████████████████████████| 195k/195k [00:00<00:00, 2.16MB/s]
ViT-H-14_laion2b_s32b_b79k_movements.safetensors: 100%|█████████████████████████████| 410k/410k [00:00<00:00, 4.89MB/s]
ViT-H-14_laion2b_s32b_b79k_trendings.safetensors: 100%|█████████████████████████████| 148k/148k [00:00<00:00, 2.91MB/s]
Preprocessing trendings: 100%|███████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  5.46it/s]
ViT-H-14_laion2b_s32b_b79k_negative.safetensors: 100%|████████████████████████████| 84.2k/84.2k [00:00<00:00, 3.47MB/s]
Loaded CLIP model and data in 43.98 seconds.
100%|██████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 73.53it/s]
Flavor chain:  16%|█████████▋                                                    | 5/32 [1:59:06<10:15:18, 1367.37s/it]

Months ago I used the same command and it was very fast. Since then I have installed Windows updates, pulled the latest version of this repo, and updated CUDA to v12.1 (I also tried 12.2, but then Torch didn't recognize the GPU).

While this script is running, my CPU is at 27%, my memory is at 86%, and my GPU is at 3%. What can I do to speed it up?

Edit: I removed all lines in flavors.txt except for the first 5. Now the Flavor chain is much faster, but it still takes 45-60 minutes per image (928x1312px, 468kB). And it looks like my GPU isn't being used.

Earlier versions took 3-5 minutes per image. Which versions of which packages should I use to get the speed back?

genevera commented 1 year ago

I had luck changing the LabelTable object's flavor_intermediate_count (ln 53) to a lower number.
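To see why a lower flavor_intermediate_count can help, here is a rough cost model, not the library's code. It assumes the flavor chain is greedy: on each of its 32 steps (the "5/32" in the log above) it scores every remaining candidate flavor and keeps the best one. The function name chain_scoring_calls is made up for illustration, and the result is an upper bound, since the real chain may stop early.

```python
def chain_scoring_calls(candidates: int, steps: int = 32) -> int:
    """Upper bound on candidate prompts scored by a greedy chain.

    Assumption: each step scores all remaining candidates, then removes
    the one it picked, so step i scores (candidates - i) prompts.
    """
    total = 0
    remaining = candidates
    for _ in range(steps):
        if remaining <= 0:
            break  # nothing left to try
        total += remaining
        remaining -= 1  # the chosen flavor is not scored again
    return total


if __name__ == "__main__":
    for count in (2048, 1024, 512):
        print(count, chain_scoring_calls(count))
```

Under this model, halving the candidate count roughly halves the number of prompts that have to be encoded and compared per image.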

pmeems commented 1 year ago

Thanks @genevera. I changed the value in the low_vram section from 1024 to 512 and used the --low_vram parameter, and now it works again. It takes about 2 minutes per image, which is fine for me.
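For anyone calling the library from Python instead of run_cli.py, the same setting can probably be overridden on a Config object without editing the source. This is only a sketch. It assumes the clip_interrogator package exposes Config (with apply_low_vram_defaults and a flavor_intermediate_count attribute) and Interrogator as described in the project README, and that Pillow is installed. The names OVERRIDES, apply_overrides and interrogate_low_vram are hypothetical helpers, not part of the library.

```python
# Value reported to work in this thread; the low-VRAM default mentioned above is 1024.
OVERRIDES = {"flavor_intermediate_count": 512}


def apply_overrides(config, overrides):
    """Set each override on the config, rejecting names the config does not have."""
    for name, value in overrides.items():
        if not hasattr(config, name):
            raise AttributeError(f"config has no setting named {name!r}")
        setattr(config, name, value)
    return config


def interrogate_low_vram(image_path, clip_model_name="ViT-H-14/laion2b_s32b_b79k"):
    """Caption one image using low-VRAM defaults plus the overrides above."""
    # Imported lazily: these pull in torch and download model weights on first use.
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    config = Config(clip_model_name=clip_model_name)
    config.apply_low_vram_defaults()    # assumed to mirror the CLI's low-VRAM option
    apply_overrides(config, OVERRIDES)  # then shrink the candidate pool further
    interrogator = Interrogator(config)
    return interrogator.interrogate(Image.open(image_path).convert("RGB"))
```

Calling interrogate_low_vram on an image path should return the generated prompt as a string. The override is applied after the low-VRAM defaults so that it is not reset by them.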

I did notice that the GPU is not doing much. Is that expected? I thought this script would use the GPU as well.