Linaqruf / kohya-trainer

Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
Apache License 2.0

Image data/directories setup issue/json example #241

Open by321 opened 1 year ago

by321 commented 1 year ago

Hi,

I read the translated English docs, and I'm still not sure how I should set up my training images and config files.

I'm trying to train a LoRA with 4 concepts, and currently my images are in 4 directories. For each image, I have already made a .txt file with tags or a description of the image. I'd prefer not to use class tokens (for example "shsdog") if possible.

It seems to me I should build a fine-tuning style dataset. My TOML file should look something like this:

[general]
shuffle_caption = false
caption_extension = '.txt'
keep_tokens = 1

# This is a fine-tuning-style dataset
[[datasets]]
resolution = [512, 512]
batch_size = 2

  [[datasets.subsets]]
  image_dir = 'C:\training_images\concept1'
  metadata_file = 'C:\training_images\concept1.json'
  flip_aug = true
  random_crop = true

  [[datasets.subsets]]
  image_dir = 'C:\training_images\concept2'
  metadata_file = 'C:\training_images\concept2.json'
  flip_aug = true
  random_crop = true

  [[datasets.subsets]]
  image_dir = 'C:\training_images\concept3'
  metadata_file = 'C:\training_images\concept3.json'
  flip_aug = true
  random_crop = true

  [[datasets.subsets]]
  image_dir = 'C:\training_images\concept4'
  metadata_file = 'C:\training_images\concept4.json'
  flip_aug = true
  random_crop = true

Is this correct? If so, how should I create my .json files? I couldn't find an example of this.

Linaqruf commented 1 year ago

Yes, but I'm not using the new dataset config in the fine-tuning notebook, so I can't help much.
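
On the .json question, here is a minimal sketch of one way to build a metadata file from the existing per-image .txt files. It rests on an assumption that should be checked against the sd-scripts docs for your version: that the metadata file is a JSON object keyed by image name (here the file name without extension), where each value holds a "caption" and/or "tags" string. The helper names `build_metadata` and `write_metadata` and the `IMAGE_EXTENSIONS` set are hypothetical, made up for this illustration, and are not part of kohya-trainer.

```python
import json
from pathlib import Path

# Hypothetical list of image types to pick up; adjust to your data.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def build_metadata(image_dir, caption_extension=".txt", field="tags"):
    """Collect one caption file per image into a metadata dict.

    ASSUMPTION: the trainer expects {image_key: {"tags": "..."}} or
    {image_key: {"caption": "..."}}; verify this for your version.
    """
    metadata = {}
    for image_path in sorted(Path(image_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # Caption file sits next to the image: foo.png -> foo.txt
        caption_path = image_path.with_suffix(caption_extension)
        if not caption_path.exists():
            continue  # skip images that have no caption file
        text = caption_path.read_text(encoding="utf-8").strip()
        # ASSUMPTION: key is the image file name without its extension.
        metadata[image_path.stem] = {field: text}
    return metadata


def write_metadata(image_dir, out_json, **kwargs):
    """Build the metadata dict and save it as UTF-8 JSON."""
    metadata = build_metadata(image_dir, **kwargs)
    Path(out_json).write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return metadata
```

With the layout from the question, this would be called once per concept, for example `write_metadata(r'C:\training_images\concept1', r'C:\training_images\concept1.json')`. Pass `field="caption"` if the .txt files hold free-form descriptions rather than comma-separated tags.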