cyber-meow / anime_screenshot_pipeline

A 99% automatized pipeline to construct training set from anime and more for text-to-image model training
MIT License

classifier_training train.py issue #7

Closed HumoristReoccupy closed 11 months ago

HumoristReoccupy commented 1 year ago

I've been trying to set up this pipeline and have run into the following output when running the train.py script for the Character Classification Training section.

C:\folder\anime_screenshot_pipeline\classifier_training> python train.py --transfer_learning --model_name L_16 --interm_features_fc --batch_size=8 --no_epochs 40 --dataset_path "C:\folder\anime_screenshot_pipeline\Training\training_test_output" --results_dir "C:\folder\anime_screenshot_pipeline\Training\Train Output" --checkpoint_path "C:\folder\anime_screenshot_pipeline\pretrain.ckpt"

Namespace(dataset_path='C:\\folder\\anime_screenshot_pipeline\\Training\\training_test_output', dataset_name='anime', model_name='L_16', results_dir='C:\\folder\\anime_screenshot_pipeline\\Training\\Train Output', image_size=128, batch_size=8, no_epochs=40, learning_rate=0.001, lr_scheduler='warmupCosine', epoch_decay=20, warmup_steps=1000, pretrained=False, checkpoint_path='C:\\folder\\anime_screenshot_pipeline\\pretrain.ckpt', transfer_learning=True, load_partial_mode=None, log_freq=10, save_checkpoint_freq=5, no_cpu_workers=4, seed=0, interm_features_fc=True, debugging=False, exclusion_loss=False, temperature=1.0, exclusion_weight=0.01, exc_layers_dist=2, multimodal=False, max_text_seq_len=None, mask_schedule=None, mask_wu_percent=0.1, mask_cd_percent=0.5, ret_attn_scores=False, tokenizer='tag', masking_behavior='constant', shuffle_tokens=False, label_csv_name='labels.csv', use_test_set=False, patch_size=16, run_name='anime_L_16_image128_batch8_SGDlr0.001_ptFalse_seed0_warmupCosine_interTrue_mmFalse_textLenNone_maskNoneconstanttagtokenizingshufFalse')
Missing keys when loading pretrained weights: ['inter_class_head.4.weight', 'inter_class_head.4.bias']
                    Expected missing keys: ['inter_class_head.4.weight', 'inter_class_head.4.bias']
Unexpected keys when loading pretrained weights: []
Loaded from custom checkpoint.
ViTConfigExtended {
  "attention_probs_dropout_prob": 0.0,
  "classifier": "token",
  "fh": 16,
  "fw": 16,
  "gh": 8,
  "gw": 8,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "image_size": 128,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "load_fc_layer": false,
  "max_text_seq_len": null,
  "model_type": "vit",
  "num_attention_heads": 16,
  "num_channels": 3,
  "num_classes": 38,
  "num_hidden_layers": 24,
  "patch_size": [
    16,
    16
  ],
  "pos_embedding_type": "learned",
  "pretrained_image_size": 224,
  "pretrained_num_channels": 3,
  "pretrained_num_classes": 21843,
  "representation_size": 1024,
  "seq_len": 65,
  "transformers_version": "4.9.1",
  "vocab_size": 30522
}

Traceback (most recent call last):
  File "C:\folder\anime_screenshot_pipeline\classifier_training\train.py", line 486, in <module>
    main()
  File "C:\folder\anime_screenshot_pipeline\classifier_training\train.py", line 482, in main
    train_main(logger, args)
  File "C:\folder\anime_screenshot_pipeline\classifier_training\train.py", line 372, in train_main
    train_one_epoch(args, f, epoch, global_step, model, device, tokenizer,
  File "C:\folder\anime_screenshot_pipeline\classifier_training\train.py", line 133, in train_one_epoch
    for i, batch in enumerate(train_loader):
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\data\dataloader.py", line 628, in __next__
    data = self._next_data()
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\data\dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\data\dataloader.py", line 1359, in _process_data
    data.reraise()
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_utils.py", line 543, in reraise
    raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\data\_utils\worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\data\_utils\fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\data\_utils\fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\folder\anime_screenshot_pipeline\classifier_training\utilities\data_selection_customize.py", line 105, in __getitem__
    img = Image.open(img_dir)
  File "C:\Users\username\AppData\Roaming\Python\Python310\site-packages\PIL\Image.py", line 3227, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\folder\\anime_screenshot_pipeline\\Training\\training_test_output\\data\\class1\\image_14876_0.png'

(pretrain.ckpt is the suggested danbooruFaces_L_16_image...False_lastEpoch.ckpt file; the full name was just too long to write out here. I am running this on Windows 10.)

I've gone back and made sure that all the required dependencies are installed. One odd thing I noticed is that the script adds a \data\ subfolder to the path that should not be there. Removing the image and regenerating labels.csv just made the script stop at a different image in a different class. So I left only one class, and it errored on every single image until there were too few left and it raised the error asking for more data images.

On a side note, is a wandb API key required? The script originally asked for an api_key, and although I signed up and got one, the result didn't change. So I just ran wandb disabled during troubleshooting to stop the prompt from appearing, and would re-enable it when testing.

cyber-meow commented 1 year ago

Hello, thank you for your question. The class images should be put in the data folder. Sorry for not being clear in the README. I will update it when I have time. I guess the original script just uses wandb by default, but disabling it is fine. Please let me know if you have any further questions.
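For readers who want to turn wandb off for a single run without touching the script, one option is the WANDB_MODE environment variable, which the wandb library reads. The sketch below only builds the environment; the helper name `wandb_disabled_env` is hypothetical, and it assumes train.py initializes wandb in the usual way, which this thread does not confirm.

```python
import os


def wandb_disabled_env(base_env=None):
    """Return a copy of the environment with wandb logging switched off.

    Hypothetical helper: pass the result as env= to subprocess.run when
    launching the training script.
    """
    env = dict(os.environ if base_env is None else base_env)
    # WANDB_MODE=disabled asks wandb to skip logging and the API key prompt.
    env["WANDB_MODE"] = "disabled"
    return env


if __name__ == "__main__":
    # Show only the variable that was added.
    print(wandb_disabled_env({})["WANDB_MODE"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` together with the train.py arguments shown in the first post.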

HumoristReoccupy commented 1 year ago

That was a fast reply, thanks. I made the separate \data\ subfolder in the classification_data_dir and moved the class subfolders into it, and that seems to do the trick.

I did originally try this but ran into an error, so I wasn't sure at the time whether this was what I had to do. Now that I know this is the intended organization, the error I was getting was:
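The reorganization described here can be sketched in a few lines of Python. This is only an illustration of the layout implied by the traceback (`<dataset_path>\data\<class>\<image>`); the function name `move_classes_into_data` is hypothetical, and it assumes every subfolder directly under the dataset path is a class folder while loose files such as labels.csv stay where they are.

```python
import shutil
from pathlib import Path


def move_classes_into_data(dataset_root: Path, data_dirname: str = "data") -> list[str]:
    """Move each class subfolder of dataset_root into dataset_root/data_dirname.

    Returns the names of the folders that were moved. Hypothetical helper,
    not part of the pipeline.
    """
    data_dir = dataset_root / data_dirname
    data_dir.mkdir(exist_ok=True)
    moved = []
    for entry in sorted(dataset_root.iterdir()):
        # Leave the data folder itself and loose files (e.g. labels.csv) alone.
        if not entry.is_dir() or entry == data_dir:
            continue
        target = data_dir / entry.name
        # Do not overwrite or nest into a class folder that is already there.
        if target.exists():
            continue
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    return moved
```

Calling it with the folder passed as `--dataset_path` would leave that argument unchanged, since the script appends `data` to the path on its own.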

File "C:\Users\usernmae\AppData\Local\Programs\Python\Python310\lib\site-packages\einops\_backends.py", line 513, in is_appropriate_type
    return self.K.is_tensor(tensor) and self.K.is_keras_tensor(tensor)
AttributeError: module 'keras.backend' has no attribute 'is_tensor'. Did you mean: '_to_tensor'?

The error came from the einops-0.3.0 package installed from the classifier_training\requirements.txt file. Checking einops' GitHub showed that it was a known issue, and updating to the latest version, einops-0.6.0, resolves it.

I made sure I had no further issues before confirming: I was able to train the new vision model and then use the classifier it produced during the folder arrangement step, both successfully.

The only other question I have is how to handle characters that do not play nice with the face detection, either because of uncommon facial structures or because hoods or blindfolds hide the face in certain shots. This affects both the training data for the classifying script and the frames being classified, since such images are flagged as "0 faces detected" and skipped.
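A quick way to see which einops version is actually installed before launching training is sketched below. The helper names are hypothetical, and 0.6.0 is simply the version reported as working in this thread, not a verified minimum.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.6.0' into (0, 6, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older_than(installed: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically."""
    return parse_version(installed) < parse_version(minimum)


def check_einops(minimum: str = "0.6.0") -> str:
    """Report whether the installed einops is at least `minimum`."""
    try:
        installed = version("einops")
    except PackageNotFoundError:
        return "einops is not installed"
    if is_older_than(installed, minimum):
        return f"einops {installed} is older than {minimum}; consider upgrading"
    return f"einops {installed} is at least {minimum}"


if __name__ == "__main__":
    print(check_einops())
```

The comparison is deliberately simple and ignores pre-release suffixes, which is enough to tell 0.3.0 apart from 0.6.0.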

cyber-meow commented 1 year ago

Glad to hear that this is working for you now. Dependency issues are always tricky, and you are right: I am also using einops-0.6.0. I will keep this in mind when I update the repo.

Concerning the face detection part: I acknowledge this is the current bottleneck, and I think it would be the bottleneck for any similar workflow. I also had problems detecting some characters when I trained my models. With the current models we can only fix this manually, but hopefully we can get a better detection model (probably for full body + head) in the near future. (I will not have time for that, and I don't know if I will be able to persuade a friend to do it.)

HumoristReoccupy commented 1 year ago

I see. I guess the manual fix would then be to move images into their respective folders if they were listed as unknown and run the metadata-correcting script, as suggested for dealing with characters' backsides, and any character not tagged from the beginning due to these limitations is SoL until improvements are found?

For now I can work with this, as this pipeline already helps my workflow a lot. Thanks again for your help.

enranime commented 1 year ago

Also, I don't know if it's just me, but if anyone has problems with the frame extraction ffmpeg script, just remove the single quotes (') around the -i and filter parameters; otherwise ffmpeg can't find your path.
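One way to sidestep this kind of quoting problem is to pass ffmpeg its arguments as a list, so no shell ever has to interpret quotes around the path or the filter. The sketch below is not the pipeline's actual command: the function names, the `fps=1` filter, and the `frame_%04d.png` output pattern are placeholders chosen for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video: Path, out_dir: Path, video_filter: str) -> list[str]:
    """Build an ffmpeg argument list; each item is passed verbatim, unquoted."""
    return [
        "ffmpeg",
        "-i", str(video),            # input path, spaces are fine without quotes
        "-filter:v", video_filter,   # video filter expression
        str(out_dir / "frame_%04d.png"),  # numbered output frames (placeholder pattern)
    ]


def extract_frames(video: Path, out_dir: Path, video_filter: str = "fps=1") -> None:
    """Run ffmpeg without a shell, so literal quote characters never reach it."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video, out_dir, video_filter), check=True)
```

Because `subprocess.run` receives a list and no `shell=True`, a path such as `C:\folder\my video.mkv` reaches ffmpeg as one argument on both Windows and Linux.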

cyber-meow commented 11 months ago

The new version should work with back shots as well when --crop_with_face is not specified. The way of calling the ffmpeg command has also been modified, so I think I can close the issue now.