kohya-ss / sd-scripts

Apache License 2.0

Can't train LoRA (and other networks) in Google Colab #1191

Open StupidGame opened 8 months ago

StupidGame commented 8 months ago

I tried to run LoRA training in Google Colab with the following cell and got the error below.

```python
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

token = "GITHUB_TOKEN" #@param {type:"string"}
git_user_name = "StupidGame" #@param {type:"string"}
git_repo_path = "StupidGame/datasets_repo" #@param {type:"string"}
git_repo_url = "https://" + git_user_name + ":" + token + "@github.com/" + git_repo_path + ".git"

#@title Install the remaining packages
#@markdown * URL of the repository that contains the dataset
!git clone $git_repo_url
!git clone https://github.com/kohya-ss/sd-scripts.git
%cd sd-scripts
!pip install -r requirements.txt
%cd ..
!pip install dadaptation
!pip install prodigyopt
!pip install lion_pytorch
!pip install wandb
!pip install xformers
!pip install protobuf==3.20.3

from accelerate.utils import write_basic_config
write_basic_config()

pretrained_model_name_or_path = "StupidGame/AnyLoRA" #@param {type:"string"}
#@markdown * Enter the location of the base model (Diffusers format or ckpt).
pretrained_model_is_v2 = False #@param {type:"boolean"}
#@markdown * Specify whether the base model is derived from SD v2.
pretrained_model_resolution = "512x512" #@param ["512x512", "768x768"]
#@markdown * Select the resolution the base model was trained at.
#@markdown ----
datasets_path = "/content/kohya-mydatasets/datasets/reina_v2.toml" #@param {type:"string"}
prompts_path = "/content/kohya-mydatasets/datasets/reina_v2.txt" #@param {type:"string"}
dream_booth_epochs = 25 #@param {type:"integer"}
#@markdown * Number of steps to train for.
#@markdown * To train roughly the same as the original Diffusers version or XavierXiao's Stable Diffusion version, double the number of steps.
#@markdown ----
learning_late = 2.4e-4 #@param {type:"number"}
#@markdown * Learning rate.
#@markdown ----
dream_booth_model_ext = "safetensors" #@param ["pt", "ckpt", "safetensors"]
#@markdown * Format to save the model in.
dream_booth_new_model = "reina_lora_test" #@param {type:"string"}
#@markdown * Name of the file / folder to save.
#@markdown ----
output_dir = "/content/drive/MyDrive/loras" #@param {type:"string"}
#@markdown * Location of the file / folder to save.
#@markdown ----
network_dim = 64 #@param {type:"integer"}
#@markdown * Number of dimensions (rank).
#@markdown ----
network_alpha = 1 #@param {type:"integer"}
#@markdown * Threshold (alpha).
#@markdown ----
te_coef = 0.5 #@param {type:"number"}
#@markdown * Coefficient applied to the text encoder learning rate.
#@markdown ----
unet_coef = 1 #@param {type:"number"}
#@markdown * Coefficient applied to the U-Net learning rate.
#@markdown ----
shutdown = True #@param {type:"boolean"}
#@markdown ----

dream_booth_new_model = dream_booth_new_model + "_" + str(learning_late)
output_dir = output_dir + "/" + dream_booth_new_model
conv_dim = "conv_dim=" + str(network_dim)
conv_alpha = "conv_alpha=" + str(network_alpha)

import os
import glob
import shutil

os.makedirs("output", exist_ok=True)

text_lr = learning_late * te_coef
unet_lr = learning_late * unet_coef

!accelerate launch --num_cpu_threads_per_process 12 sd-scripts/train_network.py \
  --pretrained_model_name_or_path=$pretrained_model_name_or_path \
  --dataset_config=$datasets_path \
  --sample_prompts=$prompts_path \
  --network_dim=$network_dim \
  --network_alpha=$network_alpha \
  --output_dir=$output_dir \
  --lr_scheduler="cosine_with_restarts" \
  --lr_scheduler_num_cycles=2 \
  --text_encoder_lr=$text_lr \
  --unet_lr=$unet_lr \
  --output_name=$dream_booth_new_model \
  --prior_loss_weight=1.0 \
  --seed=42 \
  --sample_sampler="k_euler_a" \
  --max_train_epochs=$dream_booth_epochs \
  --optimizer_type="Lion" \
  --optimizer_args "weight_decay=1e-1" "betas=0.9, 0.99" \
  --mixed_precision='fp16' \
  --xformers \
  --gradient_checkpointing \
  --save_precision='fp16' \
  --sample_every_n_epochs=1 \
  --save_model_as=$dream_booth_model_ext \
  --cache_latents \
  --bucket_no_upscale \
  --log_with="wandb" \
  --wandb_api_key="WANDB_TOKEN" \
  --network_module=networks.lora \
  --network_args $conv_dim $conv_alpha \
  --max_token_length=150 \
  --logging_dir=logs \
  --noise_offset=0.05 \
  --adaptive_noise_scale=0.05 \
  --clip_skip=2 \
  --min_snr_gamma=5

if shutdown == True:
    from google.colab import runtime
    runtime.unassign()
```

Error Description

```
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1382, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/clip/image_processing_clip.py", line 21, in <module>
    from ...image_processing_utils import BaseImageProcessor, BatchFeature, get_size_dict
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_processing_utils.py", line 28, in <module>
    from .image_transforms import center_crop, normalize, rescale
  File "/usr/local/lib/python3.10/dist-packages/transformers/image_transforms.py", line 47, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/__init__.py", line 48, in <module>
    from tensorflow._api.v2 import __internal__
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/_api/v2/__internal__/__init__.py", line 8, in <module>
    from tensorflow._api.v2.__internal__ import autograph
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/_api/v2/__internal__/autograph/__init__.py", line 8, in <module>
    from tensorflow.python.autograph.core.ag_ctx import control_status_ctx  # line: 34
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/autograph/core/ag_ctx.py", line 21, in <module>
    from tensorflow.python.autograph.utils import ag_logging
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/autograph/utils/__init__.py", line 17, in <module>
    from tensorflow.python.autograph.utils.context_managers import control_dependency_on_returns
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/autograph/utils/context_managers.py", line 19, in <module>
    from tensorflow.python.framework import ops
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py", line 29, in <module>
    from tensorflow.core.framework import attr_value_pb2
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/core/framework/attr_value_pb2.py", line 5, in <module>
    from google.protobuf.internal import builder as _builder
ImportError: cannot import name 'builder' from 'google.protobuf.internal' (/usr/local/lib/python3.10/dist-packages/google/protobuf/internal/__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/import_utils.py", line 710, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 20, in <module>
    from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer, CLIPVisionModelWithProjection
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1373, in __getattr__
    value = getattr(module, name)
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1372, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.image_processing_clip because of the following error (look up to see its traceback):
cannot import name 'builder' from 'google.protobuf.internal' (/usr/local/lib/python3.10/dist-packages/google/protobuf/internal/__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/content/sd-scripts/train_network.py", line 22, in <module>
    from library import model_util
  File "/content/sd-scripts/library/model_util.py", line 13, in <module>
    from diffusers import AutoencoderKL, DDIMScheduler, StableDiffusionPipeline  # , UNet2DConditionModel
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/import_utils.py", line 701, in __getattr__
    value = getattr(module, name)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/import_utils.py", line 701, in __getattr__
    value = getattr(module, name)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/import_utils.py", line 700, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/import_utils.py", line 712, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion because of the following error (look up to see its traceback):
Failed to import transformers.models.clip.image_processing_clip because of the following error (look up to see its traceback):
cannot import name 'builder' from 'google.protobuf.internal' (/usr/local/lib/python3.10/dist-packages/google/protobuf/internal/__init__.py)

Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'sd-scripts/train_network.py', '--pretrained_model_name_or_path=StupidGame/AnyLoRA', '--dataset_config=/content/kohya-mydatasets/datasets/reina_v2.toml', '--sample_prompts=/content/kohya-mydatasets/datasets/reina_v2.txt', '--network_dim=64', '--network_alpha=1', '--output_dir=/content/drive/MyDrive/loras/reina_lora_sdxl_0.00024', '--lr_scheduler=cosine_with_restarts', '--lr_scheduler_num_cycles=2', '--text_encoder_lr=0.00012', '--unet_lr=0.00024', '--output_name=reina_lora_sdxl_0.00024', '--prior_loss_weight=1.0', '--seed=42', '--sample_sampler=k_euler_a', '--max_train_epochs=25', '--optimizer_type=Lion', '--optimizer_args', 'weight_decay=1e-1', 'betas=0.9, 0.99', '--mixed_precision=fp16', '--xformers', '--gradient_checkpointing', '--save_precision=fp16', '--sample_every_n_epochs=1', '--save_model_as=safetensors', '--cache_latents', '--bucket_no_upscale', '--log_with=wandb', '--wandb_api_key=WANDB_TOKEN', '--network_module=networks.lora', '--network_args', 'conv_dim=64', 'conv_alpha=1', '--max_token_length=150', '--logging_dir=logs', '--noise_offset=0.05', '--adaptive_noise_scale=0.05', '--clip_skip=2', '--min_snr_gamma=5']' returned non-zero exit status 1.
```
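The chain of RuntimeErrors bottoms out in `ImportError: cannot import name 'builder' from 'google.protobuf.internal'`: transformers' image utilities import TensorFlow, whose generated `*_pb2.py` files need the protobuf `builder` helper that only exists in protobuf 3.20.0 and newer, so the protobuf actually importable at runtime appears to be older than the `protobuf==3.20.3` pinned in the cell. A small diagnostic cell along these lines (a sketch, not part of the original notebook) can confirm which protobuf build the runtime is really using:

```python
# Diagnostic sketch (not from the issue): check which protobuf is importable in
# the Colab runtime and whether it provides the `builder` module that
# TensorFlow's generated *_pb2.py files try to import.
import google.protobuf
print("protobuf version:", google.protobuf.__version__)

try:
    from google.protobuf.internal import builder  # present in protobuf >= 3.20.0
    print("google.protobuf.internal.builder is available")
except ImportError as e:
    print("builder missing:", e)
```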

kohya-ss commented 8 months ago

I didn't make that Colab notebook, so please check with its author. That said, according to the error message, the failure comes from the interaction between TensorFlow and protobuf. If you don't need TensorFlow, uninstalling it may get the training to run.
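For reference, a minimal sketch of that suggestion as an extra Colab cell run before the training cell; the exact commands below are assumptions and not part of kohya-ss's reply:

```python
# Sketch of the suggested workaround (assumes TensorFlow is not needed for this run).
!pip uninstall -y tensorflow

# Alternative, if TensorFlow must stay installed: force-reinstall a protobuf
# release that ships google.protobuf.internal.builder (3.20.0 or newer) and
# restart the runtime so the new version is actually picked up.
# !pip install --force-reinstall protobuf==3.20.3
```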

StupidGame commented 8 months ago

Thank you! I'll give it a try!