kohya-ss / sd-scripts

Apache License 2.0

SDXL training crashes when trying to use fp16 and model is not defined as variant model #1420

Open omgold opened 1 month ago

omgold commented 1 month ago

When I use --mixed_precision=fp16 for SDXL training and the model files are not named as fp16 variant files (so no `variant=fp16` weights are found), a ValueError is raised:

Traceback (most recent call last):                                                                                                                                            
  File "/opt/ai/kohya/sdxl_train.py", line 818, in <module>                                                                                                                   
    train(args)                                                                                                                                                               
  File "/opt/ai/kohya/sdxl_train.py", line 217, in train                                                                                                                      
    ) = sdxl_train_util.load_target_model(args, accelerator, "sdxl", weight_dtype)                                                                                            
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                            
  File "/opt/ai/kohya/library/sdxl_train_util.py", line 40, in load_target_model                                                                                              
    ) = _load_target_model(                                                                                                                                                   
        ^^^^^^^^^^^^^^^^^^^                                                                                                                                                   
  File "/opt/ai/kohya/library/sdxl_train_util.py", line 87, in _load_target_model                                                                                             
    pipe = StableDiffusionXLPipeline.from_pretrained(                                                                                                                         
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                         
  File "/opt/ai-venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn                                                               
    return fn(*args, **kwargs)                                                                                                                                                
           ^^^^^^^^^^^^^^^^^^^                                                                                                                                                
  File "/opt/ai-venv/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1096, in from_pretrained                                                       
    cached_folder = cls.download(                                                                                                                                             
                    ^^^^^^^^^^^^^                                                                                                                                             
  File "/opt/ai-venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn                                                               
    return fn(*args, **kwargs)                                                                                                                                                
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/ai-venv/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1711, in download                                                              
    deprecate("no variant default", "0.24.0", deprecation_message, standard_warn=False)                                                                                       
  File "/opt/ai-venv/lib/python3.11/site-packages/diffusers/utils/deprecation_utils.py", line 18, in deprecate                                                                
    raise ValueError(                                                                  
ValueError: The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available.The default model files: {'vae/diffusion_pytorch_model.bin', 'text_encoder_2/pytorch_model.bin', 'text_encoder/pytorch_model.bin', 'unet/diffusion_pytorch_model.bin'} will be loaded instead. Make sure to not load from `variant=fp16`if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variantmodeling files is deprecated.") should be removed since diffusers' version 0.25.0 is >= 0.24.0

Apparently the code in library/sdxl_train_util.py expects an EnvironmentError in that case, so the ValueError is not caught and the fp32 fallback never runs:

     81         from diffusers import StableDiffusionXLPipeline
     82 
     83         variant = "fp16" if weight_dtype == torch.float16 else None
     84         logger.info(f"load Diffusers pretrained models: {name_or_path}, variant={variant}")
     85         try:
     86             try:
     87                 pipe = StableDiffusionXLPipeline.from_pretrained(
     88                     name_or_path, torch_dtype=model_dtype, variant=variant, tokenizer=None
     89                 )
     90             except EnvironmentError as ex:
     91                 if variant is not None:
     92                     logger.info("try to load fp32 model")
     93                     pipe = StableDiffusionXLPipeline.from_pretrained(name_or_path, variant=None, tokenizer=None)
     94                 else:
     95                     raise ex

When I change the except clause on line 90 to also catch ValueError, training works for me.
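
For illustration, here is a minimal sketch of that workaround, written as a standalone helper so it can be tried without downloading a model. The helper name `load_with_variant_fallback` and its `from_pretrained` parameter are hypothetical and not part of sd-scripts or diffusers. Note one difference from the listing above: this sketch forwards the same keyword arguments to both attempts, whereas the original fallback call drops `torch_dtype`.

```python
from typing import Any, Callable, Optional


def load_with_variant_fallback(
    from_pretrained: Callable[..., Any],
    name_or_path: str,
    variant: Optional[str],
    **kwargs: Any,
) -> Any:
    """Try the requested variant first, then retry with the default weights.

    `from_pretrained` is any loader that accepts `variant=` as a keyword
    argument (hypothetical parameter, used here for illustration).
    """
    try:
        return from_pretrained(name_or_path, variant=variant, **kwargs)
    except (EnvironmentError, ValueError):
        # Older loaders signalled a missing variant with EnvironmentError;
        # the traceback above shows a ValueError instead, so catch both.
        if variant is None:
            # No variant was requested, so there is nothing to fall back to.
            raise
        return from_pretrained(name_or_path, variant=None, **kwargs)
```

In sdxl_train_util.py this would presumably be called as `load_with_variant_fallback(StableDiffusionXLPipeline.from_pretrained, name_or_path, variant, tokenizer=None)`. Catching ValueError this broadly may also hide unrelated errors from the first attempt, so a real fix might want to log the original exception before retrying.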

kohya-ss commented 1 month ago

Thank you! Diffusers seems to have changed its behavior... I will fix it soon.