Linaqruf / kohya-trainer

Adapted from for easier cloning
Apache License 2.0

BLIP captioning throws an error under specific circumstances #161

Closed: oscarnevarezleal closed this issue 1 year ago

oscarnevarezleal commented 1 year ago

What happened?

BLIP captioning throws an error under specific circumstances.

I'm working on a project that relies on this one, and when I run it I encounter an error. This doesn't happen in Colab, which makes me believe the problem is related to my installation and its dependencies. Any help would be very much appreciated.


./finetune/ /opt/ml/input/data/train --batch_size 4 --caption_extension .caption --max_data_loader_n_workers 2 --debug 
load images from /opt/ml/input/data/train
found 6 images.
loading GIT: microsoft/git-large-textcaps
Downloading (…)rocessor_config.json: 100%|██████████████████████████████████████████████████████████████████| 503/503 [00:00<00:00, 135kB/s]
Downloading (…)okenizer_config.json: 100%|██████████████████████████████████████████████████████████████████| 453/453 [00:00<00:00, 138kB/s]
Downloading (…)solve/main/vocab.txt: 100%|███████████████████████████████████████████████████████████████| 232k/232k [00:00<00:00, 75.3MB/s]
Downloading (…)/main/tokenizer.json: 100%|████████████████████████████████████████████████████████████████| 711k/711k [00:00<00:00, 200MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████| 125/125 [00:00<00:00, 96.6kB/s]
Downloading (…)lve/main/config.json: 100%|██████████████████████████████████████████████████████████████| 2.82k/2.82k [00:00<00:00, 836kB/s]
Downloading (…)"pytorch_model.bin";: 100%|██████████████████████████████████████████████████████████████| 1.58G/1.58G [00:03<00:00, 414MB/s]
Downloading (…)neration_config.json: 100%|█████████████████████████████████████████████████████████████████| 141/141 [00:00<00:00, 42.4kB/s]
GIT loaded
  0%|                                                                                                                 | 0/2 [00:00<?, ?it/s]
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A10G         On   | 00000000:00:1E.0 Off |                    0 |
|  0%   23C    P8    23W / 300W |      0MiB / 23028MiB |      0%      Default |
|                               |                      |                  N/A |

| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |

Python 3.9, CUDA 11.6, torch 1.13, Deps below

    - toml
    - opencv-python
    - prettytable
    - triton==2.0.0.dev20221120
    - wandb
    - pillow==9.1.0
    - accelerate==0.15.0
    - transformers==4.26.0
    - ftfy==6.1.1
    - albumentations==1.3.0
    - opencv-python==
    - einops==0.6.0
    - diffusers[torch]==0.10.2
    - pytorch-lightning==1.9.0
    - bitsandbytes==0.35.0
    - tensorboard==2.10.1
    - safetensors==0.2.6
    - tensorflow==2.10.1
    - requests==2.28.2
    - huggingface-hub==0.12.0
    - timm==0.6.12
    - fairscale==0.4.13
    - lion-pytorch==0.0.6
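
A dependency list like the one above can be sanity-checked before comparing it against the Colab environment. The following is a minimal sketch, assuming the list is available as plain text lines in the `- name==version` form shown; `parse_pins` and `find_problems` are hypothetical helper names, not part of kohya-trainer. It only flags entries that are listed more than once or that have `==` with no version after it.

```python
from collections import Counter
from typing import List, Optional, Tuple


def parse_pins(lines: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Split '- name==version' lines into (name, version) pairs.

    version is None when there is no '==', and '' when '==' has nothing after it.
    """
    pins = []
    for raw in lines:
        entry = raw.strip().lstrip("-").strip()
        if not entry:
            continue
        name, sep, version = entry.partition("==")
        pins.append((name.strip(), version.strip() if sep else None))
    return pins


def find_problems(pins: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Report duplicated package names and empty version pins."""
    problems = []
    counts = Counter(name for name, _ in pins)
    for name, n in counts.items():
        if n > 1:
            problems.append(f"{name}: listed {n} times")
    for name, version in pins:
        if version == "":
            problems.append(f"{name}: '==' with no version")
    return problems


# A few entries copied from the list above.
sample = ["- toml", "- opencv-python", "- pillow==9.1.0", "- opencv-python=="]
for problem in find_problems(parse_pins(sample)):
    print(problem)
```

This check does not tell you which versions are correct; it only points at entries worth a second look before diffing against a working environment.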

Expected behavior

Captioning works as in Colab

Linaqruf commented 1 year ago

Hi, it looks like you are accidentally running GIT captioning rather than BLIP. Your log shows:

load images from /opt/ml/input/data/train
found 6 images.
loading GIT: microsoft/git-large-textcaps
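
One way to confirm which captioner a run actually loaded is to search its captured output for the `loading GIT:` line that appears in the log in this thread. The sketch below assumes the output has been saved as text; `find_git_model` is a hypothetical helper name, and only the GIT log line is matched because it is the only one shown in this thread.

```python
import re
from typing import Optional

# Matches the log line printed in this thread, e.g.
# "loading GIT: microsoft/git-large-textcaps"
GIT_LINE = re.compile(r"^loading GIT:\s*(\S+)", re.MULTILINE)


def find_git_model(log_text: str) -> Optional[str]:
    """Return the GIT model name if the log shows GIT being loaded, else None."""
    match = GIT_LINE.search(log_text)
    return match.group(1) if match else None


# Log excerpt copied from the report above.
log = """load images from /opt/ml/input/data/train
found 6 images.
loading GIT: microsoft/git-large-textcaps
GIT loaded
"""

model = find_git_model(log)
if model is not None:
    print(f"The log shows the GIT captioner being loaded ({model}), not BLIP.")
```

If the helper returns a model name, the command that was run is the GIT captioning script, so the invocation itself is the first thing to check.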