(Caption colab) PIL.UnidentifiedImageError: cannot identify image file

Jaobsz commented 1 year ago

Hey, I wanted to try the DreamBooth Colab with captions ( https://github.com/TheLastBen/fast-stable-diffusion/blob/Captions_dir/fast-DreamBooth.ipynb ). It was working until recently, but now I keep getting an error:

Training the UNet...
'########:'########:::::'###::::'####:'##::: ##:'####:'##::: ##::'######:::
... ##..:: ##.... ##:::'## ##:::. ##:: ###:: ##:. ##:: ###:: ##:'##... ##::
::: ##:::: ##:::: ##::'##:. ##::: ##:: ####: ##:: ##:: ####: ##: ##:::..:::
::: ##:::: ########::'##:::. ##:: ##:: ## ## ##:: ##:: ## ## ##: ##::'####:
::: ##:::: ##.. ##::: #########:: ##:: ##. ####:: ##:: ##. ####: ##::: ##::
::: ##:::: ##::. ##:: ##.... ##:: ##:: ##:. ###:: ##:: ##:. ###: ##::: ##::
::: ##:::: ##:::. ##: ##:::: ##:'####: ##::. ##:'####: ##::. ##:. ######:::
:::..:::::..:::::..::..:::::..::....::..::::..::....::..::::..:::......::::

  0% 0/6000 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 852, in <module>
    main()
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 672, in main
    for step, batch in enumerate(train_dataloader):
  File "/usr/local/lib/python3.8/dist-packages/accelerate/data_loader.py", line 348, in __iter__
    current_batch = next(dataloader_iter)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 671, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 350, in __getitem__
    instance_image = Image.open(path)
  File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 2895, in open
    raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file '/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit/instance_images/gothoutfit-(27).txt'
  0% 0/6000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/content/diffusers/examples/dreambooth/train_dreambooth.py', '--external_captions', '--image_captions_filename', '--train_only_unet', '--stop_text_encoder_training=500', '--save_starting_step=1000', '--save_n_steps=1000', '--Session_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit', '--pretrained_model_name_or_path=/content/stable-diffusion-custom', '--instance_data_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit/instance_images', '--captions_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit/captions', '--output_dir=/content/models/GothOutfit', '--instance_prompt=', '--seed=68267', '--resolution=512', '--mixed_precision=fp16', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--gradient_checkpointing', '--use_8bit_adam', '--learning_rate=2e-06', '--lr_scheduler=polynomial', '--lr_warmup_steps=0', '--max_train_steps=6000']' returned non-zero exit status 1.
Something went wrong

The error:

PIL.UnidentifiedImageError: cannot identify image file '/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit/instance_images/gothoutfit-(27).txt'
  0% 0/6000 [00:00<?, ?it/s]

I tried multiple times, both from the last session and from a new one, and I cleaned my Google Drive and re-uploaded my images, but nothing works. I'm using external captions because manual captioning is broken.

The number of the .txt file doesn't matter (there were 30 of them): sometimes the error happens on the 1st, sometimes on the 4th, and so on.
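For anyone debugging the same thing: the traceback shows `Image.open` being handed a `.txt` path from `instance_images`, so any caption file in that folder can trigger the error, whichever one the loader reaches first. A minimal sketch for listing what is actually in the folder is below. The helper name `split_by_type` and the extension list are assumptions for illustration; they are not part of the notebook or of `train_dreambooth.py`.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset uses.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def split_by_type(folder):
    """Return (images, others): sorted file names in `folder`, split by extension."""
    images, others = [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # ignore sub-folders
        if path.suffix.lower() in IMAGE_EXTS:
            images.append(path.name)
        else:
            others.append(path.name)
    return images, others


# Example call (path taken from the log above):
# images, others = split_by_type(
#     "/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit/instance_images")
# print(len(images), "images;", "non-image files:", others)
```

If `others` is not empty, those are the files the training script would try to open as images.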

I guess an update broke it.

PS: Thanks a lot for this repo, it's very useful!

While I'm here, is there a proven difference in effectiveness with captions? When writing the captions, should I write "[...] wearing a gothoutfit [...]" or not, given that the training hasn't been done yet?

TheLastBen commented 1 year ago

Remove "gothoutfit-(27).txt" from the instance folder.

I haven't experimented with captions yet.

Jaobsz commented 1 year ago

Yeah, no, it doesn't change a thing: it picks another .txt and throws the same error.
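Since removing one caption file only shifts the failure to the next one, a possible workaround is to move every `.txt` out of `instance_images` in one go. The launch command in the log passes a separate `--captions_dir`, so the sketch below moves the files there; whether the notebook reads captions only from that folder is an assumption, and `move_captions` is a hypothetical helper, not something the repo provides.

```python
import shutil
from pathlib import Path


def move_captions(instance_dir, captions_dir):
    """Move all .txt files from instance_dir to captions_dir; return the moved names."""
    src, dst = Path(instance_dir), Path(captions_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for txt in sorted(src.glob("*.txt")):
        target = dst / txt.name
        if target.exists():
            continue  # keep an existing caption rather than overwrite it
        shutil.move(str(txt), str(target))
        moved.append(txt.name)
    return moved


# Example call (both paths taken from the launch command in the log):
# session = "/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/GothOutfit"
# print(move_captions(session + "/instance_images", session + "/captions"))
```

After running it, `instance_images` should contain only image files, so `Image.open` is no longer given a text file.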

TheLastBen commented 1 year ago

Did you manually upload the text files into the instance_images folder?

Jaobsz commented 1 year ago

Tried manually and it didn't change a thing. I guess I'll keep doing the normal method.

TheLastBen / fast-stable-diffusion

(Caption colab) PIL.UnidentifiedImageError: cannot identify image file #1104