Open BroctorDF opened 1 year ago
I'm sorry, I don't have any plans to continue the Fast Kohya Trainer project (for now).
The code in the single cell keeps getting longer, and my device is not powerful enough to handle it (I run most things on a small laptop), which causes lag.
It's really hard to maintain the 1-click cell colab.
There is another good 1-click cell colab project maintained by HollowStrawBerry, which is integrated with Google Drive. It is based on my notebook (?), so you may get the same experience, or an even better one. You can check it out here: https://github.com/hollowstrawberry/kohya-colab
Hi, I was trying to create a LoRA, but I keep getting this error:
2023-03-20 04:40:56.630426: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-03-20 04:40:57.674789: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-03-20 04:40:57.675017: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-03-20 04:40:57.675048: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-03-20 04:41:00.803315: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-03-20 04:41:01.503107: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-03-20 04:41:01.503212: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-03-20 04:41:01.503248: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
prepare tokenizer
update token length: 225
Train with captions.
loading existing metadata: /content/drive/MyDrive/training_dir/meta_lat.json
Traceback (most recent call last):
File "/content/kohya-trainer/train_network.py", line 663, in <module>
train(args)
File "/content/kohya-trainer/train_network.py", line 94, in train
train_dataset_group = config_util.generate_dataset_group_by_blueprint(blueprint.dataset_group)
File "/content/kohya-trainer/library/config_util.py", line 368, in generate_dataset_group_by_blueprint
dataset = dataset_klass(subsets=subsets, **asdict(dataset_blueprint.params))
File "/content/kohya-trainer/library/train_util.py", line 911, in __init__
assert len(abs_path) >= 1, f"no image / 画像がありません: {image_key}"
AssertionError: no image / 画像がありません: 10
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py", line 1104, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/content/kohya-trainer/train_network.py', '--output_name=adrcl', '--pretrained_model_name_or_path=/content/pretrained_model/anything-v3-fp32-pruned.safetensors', '--vae=/content/vae/anime.vae.pt', '--train_data_dir=/content/drive/MyDrive/ADR/5_adrcll', '--in_json=/content/drive/MyDrive/training_dir/meta_lat.json', '--output_dir=/content/drive/MyDrive/training_dir/output', '--network_dim=128', '--network_alpha=128', '--network_module=networks.lora', '--unet_lr=0.0001', '--text_encoder_lr=5e-05', '--optimizer_type=AdamW8bit', '--learning_rate=2e-06', '--lr_scheduler=constant', '--lr_warmup_steps=250', '--dataset_repeats=10', '--resolution=512', '--keep_tokens=1', '--lowram', '--mixed_precision=fp16', '--save_precision=fp16', '--save_n_epoch_ratio=3', '--save_model_as=safetensors', '--train_batch_size=4', '--max_token_length=225', '--max_train_epochs=20', '--clip_skip=2', '--logging_dir=/content/training_dir/logs', '--log_prefix=adrcl', '--shuffle_caption', '--xformers']' returned non-zero exit status 1.
How do I solve it? I'm not a programmer, so I'd appreciate it if you could explain it as if I were a 5-year-old.
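One way to narrow this down: the assertion in the traceback fires when an entry in the metadata file (here the key `10`) has no matching image. The sketch below lists every metadata key that has no image file with the same name in the training folder. It assumes that `meta_lat.json` is a JSON object whose keys are image names without extension (or full paths), and that the images sit directly in the `--train_data_dir` folder. The function name `find_missing_images` and the extension list are hypothetical choices for illustration, not part of kohya-trainer.

```python
import json
from pathlib import Path

# Assumed set of image extensions; adjust it to match your dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def find_missing_images(metadata_path, image_dir):
    """Return the metadata keys that have no matching image file in image_dir."""
    with open(metadata_path, encoding="utf-8") as f:
        # Assumed layout: {"image_key": {...}, ...}
        metadata = json.load(f)

    # Collect file names without extension for every image in the folder.
    image_stems = {
        p.stem
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    }

    missing = []
    for key in metadata:
        # A key may be a full path to an existing file, or a bare image name.
        if Path(key).is_file() or key in image_stems:
            continue
        missing.append(key)
    return missing


# Example call with the paths from the command above (run it in a Colab cell):
# print(find_missing_images(
#     "/content/drive/MyDrive/training_dir/meta_lat.json",
#     "/content/drive/MyDrive/ADR/5_adrcll",
# ))
```

If the list is not empty, the metadata file probably describes images that are not in the folder being trained on, for example because it was left over from an earlier dataset. In that case, regenerating the metadata for the current folder or pointing `--train_data_dir` at the folder that holds those images would be the things to try.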