hollowstrawberry / kohya-colab

Accessible Google Colab notebooks for Stable Diffusion Lora training, based on the work of kohya-ss and Linaqruf
GNU General Public License v3.0
621 stars 93 forks

v-pred model training #228

Open AriannaHeartbell opened 1 month ago

AriannaHeartbell commented 1 month ago

I'm encountering an error when trying to train using a recently uploaded model on Civitai (https://civitai.com/models/833294?modelVersionId=968495). The following error occurs:

Traceback (most recent call last):
  File "/content/kohya-trainer/train_network_xl_wrapper.py", line 10, in <module>
    trainer.train(args)
  File "/content/kohya-trainer/train_network.py", line 213, in train
    model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator)
  File "/content/kohya-trainer/sdxl_train_network.py", line 34, in load_target_model
    ) = sdxl_train_util.load_target_model(args, accelerator, sdxl_model_util.MODEL_VERSION_SDXL_BASE_V0_9, weight_dtype)
  File "/content/kohya-trainer/library/sdxl_train_util.py", line 33, in load_target_model
    ) = _load_target_model(
  File "/content/kohya-trainer/library/sdxl_train_util.py", line 70, in _load_target_model
    ) = sdxl_model_util.load_models_from_sdxl_checkpoint(model_version, name_or_path, device, weight_dtype)
  File "/content/kohya-trainer/library/sdxl_model_util.py", line 260, in load_models_from_sdxl_checkpoint
    info1 = text_model1.load_state_dict(te1_sd)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for CLIPTextModel:
    Missing key(s) in state_dict: "text_model.embeddings.position_ids".
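
For readers hitting the same message: the traceback shows that the checkpoint's first text encoder state dict has no `text_model.embeddings.position_ids` entry, while the `CLIPTextModel` being loaded expects one. A minimal sketch of how one might confirm the gap and fill it in before loading is below. It is not the notebook's actual code: `missing_keys`, `add_position_ids`, and the `make_ids` hook are hypothetical names, and the length of 77 assumes the usual CLIP token limit.

```python
POSITION_IDS_KEY = "text_model.embeddings.position_ids"


def missing_keys(expected_keys, state_dict):
    """Return the expected keys that are absent from state_dict, sorted."""
    return sorted(set(expected_keys) - set(state_dict))


def add_position_ids(state_dict, max_positions=77, make_ids=None):
    """Return a copy of state_dict with position_ids filled in if absent.

    make_ids is a hypothetical hook. With PyTorch one would pass
    lambda n: torch.arange(n).unsqueeze(0); the default builds nested
    lists so this sketch runs without torch installed.
    """
    if make_ids is None:
        make_ids = lambda n: [list(range(n))]
    patched = dict(state_dict)  # shallow copy, the input is left untouched
    if POSITION_IDS_KEY not in patched:
        patched[POSITION_IDS_KEY] = make_ids(max_positions)
    return patched


# Toy state dict standing in for the checkpoint's text encoder weights.
checkpoint_sd = {"text_model.embeddings.token_embedding.weight": 0}
expected = [POSITION_IDS_KEY, "text_model.embeddings.token_embedding.weight"]

print(missing_keys(expected, checkpoint_sd))
print(missing_keys(expected, add_position_ids(checkpoint_sd)))
```

Whether patching the key this way is safe for a given model is an assumption; it only addresses this one missing entry.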

Meanwhile, when I use the same settings but with load_diffusers, to load a diffusers version of the model, I get the following error:

Traceback (most recent call last):
  File "/content/kohya-trainer/train_network_xl_wrapper.py", line 10, in <module>
    trainer.train(args)
  File "/content/kohya-trainer/train_network.py", line 213, in train
    model_version, text_encoder, vae, unet = self.load_target_model(args, weight_dtype, accelerator)
  File "/content/kohya-trainer/sdxl_train_network.py", line 34, in load_target_model
    ) = sdxl_train_util.load_target_model(args, accelerator, sdxl_model_util.MODEL_VERSION_SDXL_BASE_V0_9, weight_dtype)
  File "/content/kohya-trainer/library/sdxl_train_util.py", line 33, in load_target_model
    ) = _load_target_model(
  File "/content/kohya-trainer/library/sdxl_train_util.py", line 79, in _load_target_model
    pipe = StableDiffusionXLPipeline.from_pretrained(name_or_path, torch_dtype=weight_dtype, variant=variant, tokenizer=None)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 1253, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 418, in load_sub_model
    class_obj, class_candidates = get_class_obj_and_candidates(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py", line 326, in get_class_obj_and_candidates
    class_obj = getattr(library, class_name)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/utils/import_utils.py", line 703, in __getattr__
    raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module diffusers has no attribute EDMDPMSolverMultistepScheduler

The above error seems to be resolved by upgrading diffusers to the latest version. However, I then get this warning, which I don't know how to handle:

The deprecation tuple ('no variant default', '0.24.0', 
"You are trying to load the model files of the variant=fp16, but no such modeling files are available. 
The default model files: 
{
    'text_encoder/model.safetensors', 
    'text_encoder_2/model.safetensors', 
    'vae/diffusion_pytorch_model.safetensors', 
    'unet/diffusion_pytorch_model.safetensors'
} 
will be loaded instead. Make sure to not load from variant=fp16 if such variant modeling files are not available. 
Doing so will lead to an error in v0.24.0 as defaulting to non-variant modeling files is deprecated.") 

(I tried setting the diffusers version to 0.23.0, but then I ran into the EDMDPMSolverMultistepScheduler error above.)
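
The warning above says that files for variant=fp16 were requested but the model folder only contains the default files. A minimal sketch, assuming a local diffusers-format folder: look for variant-named weight files (for example `diffusion_pytorch_model.fp16.safetensors`) and only request the variant when they exist. `pick_variant` is a hypothetical helper, not part of diffusers or kohya-trainer.

```python
from pathlib import Path


def pick_variant(model_dir, requested="fp16"):
    """Return `requested` if any file named like '*.<requested>.*' exists
    anywhere under model_dir, otherwise None (meaning: use default files)."""
    if requested is None:
        return None
    has_variant = any(Path(model_dir).rglob(f"*.{requested}.*"))
    return requested if has_variant else None
```

The result could then be passed as the `variant` argument of `StableDiffusionXLPipeline.from_pretrained`, the call shown in the traceback; `None` selects the default files.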

Other SDXL models, and non-v-prediction versions of this model (https://civitai.com/models/833294?modelVersionId=932238), train without any issues.

hollowstrawberry commented 1 month ago

I should investigate the parameters needed for a v-prediction model

832r329rvuj20r2 commented 1 month ago

https://civitai.com/models/282341?modelVersionId=826474 Same error with this model.

LilBingusarus commented 4 weeks ago

Are there any updates for this issue? I'm trying to train on the same model the first person was using and I'm having the same issue

hollowstrawberry commented 1 week ago

It appears v-pred is out of reach for now, as it's a newer feature and this colab is kept intentionally on an old version (due to memory issues we experienced when trying to upgrade).

You should try Jelo's colab, which has its own guide. It's a bit more complicated; notably, you need to connect to it using a free program on your own computer. If it doesn't support v-pred models yet, it should support them soon.

AriannaHeartbell commented 1 week ago

I apologize for raising an issue and then completely forgetting about it. An upgraded epsilon 1.0 version of the model has since been released, and it works well with your colab notebook. However, I'd like to make a small correction for the benefit of users like me, who might not be as familiar with programming topics but are reading this thread: the model I mentioned is not a v-pred model. I later learned that "epsilon" refers to 'error', meaning the model predicts the error (noise) in the same way as a typical diffusion model. So the issue mentioned here doesn't seem to be related to v-prediction.
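
To make the epsilon versus v-prediction distinction concrete, here is a small sketch of the two training targets for a single scalar value. It assumes the common variance-preserving setup where alpha^2 + sigma^2 = 1 and the v-prediction definition v = alpha * eps - sigma * x0; `training_target` is a hypothetical helper, not code from the notebook, and the names "epsilon" and "v_prediction" follow the `prediction_type` values used by diffusers schedulers.

```python
import math


def training_target(x0, eps, alpha_bar, prediction_type):
    """Return the value the model is trained to predict for one scalar.

    x0: clean sample value, eps: the noise that was added,
    alpha_bar: cumulative signal fraction at this timestep (0 < alpha_bar <= 1).
    """
    alpha = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    if prediction_type == "epsilon":
        # Epsilon-prediction: the target is the noise itself.
        return eps
    if prediction_type == "v_prediction":
        # V-prediction: the target mixes noise and clean signal.
        return alpha * eps - sigma * x0
    raise ValueError(f"unknown prediction_type: {prediction_type}")


print(training_target(1.0, 0.5, 0.64, "epsilon"))
print(training_target(1.0, 0.5, 0.64, "v_prediction"))
```

A trainer configured for one target but given a model built for the other would be optimizing toward the wrong quantity, which is why the two kinds of model need different settings.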

AIuser0101 commented 1 week ago

Ah, this explains why every time I tried v-pred models it kept crashing. Thanks again for updating the notebook!