huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

🙋 Flexible Directory Structure Support for Custom Pipeline Scenarios #10041

Open townwish4git opened 3 days ago

townwish4git commented 3 days ago

🚨 Flexible Directory Structure for Custom Pipeline Scenarios Raised Error!

When I define some custom pipelines locally and the directory structure looks like this:

custom_pipelines
├── custom_modules
│   ├── autoencoders
│   │   └── vae_a.py
│   └── transformers
│       ├── transformer_a.py
│       └── transformer_b.py
├── pipeline_a.py
└── pipeline_b.py

and then try to use a custom pipeline like this:

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    custom_pipeline="/path/to/my_custom_pipeline.py",
)

It raises an error like this:

> python ****/playground/test_pipes/custom_pipelines/test.py
Traceback (most recent call last):
  File "****/playground/test_pipes/custom_pipelines/test.py", line 4, in <module>
    pipe = StableDiffusion3Pipeline.from_pretrained(
  File "****/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "****/diffusers/src/diffusers/pipelines/pipeline_utils.py", line 699, in from_pretrained
    cached_folder = cls.download(
  File "****/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "****/diffusers/src/diffusers/pipelines/pipeline_utils.py", line 1402, in download
    pipeline_class = _get_pipeline_class(
  File "****/diffusers/src/diffusers/pipelines/pipeline_loading_utils.py", line 347, in _get_pipeline_class
    return _get_custom_pipeline_class(
  File "****/diffusers/src/diffusers/pipelines/pipeline_loading_utils.py", line 326, in _get_custom_pipeline_class
    return get_class_from_dynamic_module(
  File "****/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "****/diffusers/src/diffusers/utils/dynamic_modules_utils.py", line 447, in get_class_from_dynamic_module
    final_module = get_cached_module_file(
  File "****/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "****/diffusers/src/diffusers/utils/dynamic_modules_utils.py", line 336, in get_cached_module_file
    shutil.copy(os.path.join(pretrained_model_name_or_path, module_needed), submodule_path / module_needed)
  File "****/lib/python3.8/shutil.py", line 409, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "****/lib/python3.8/shutil.py", line 259, in copyfile
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '****/playground/test_pipes/custom_pipelines/custom_modules.transformers.transformer_sd3.py'

🕵🏻‍♂️ Reason for Error

The error occurs because every file involved in a relative import within the custom pipeline file is copied to a subfolder of HF_MODULES_CACHE. The code that performs this copy is:

    # Check we have all the requirements in our environment
    modules_needed = check_imports(resolved_module_file)

    # Now we move the module inside our cached dynamic modules.
    full_submodule = DIFFUSERS_DYNAMIC_MODULE_NAME + os.path.sep + submodule
    create_dynamic_module(full_submodule)
    submodule_path = Path(HF_MODULES_CACHE) / full_submodule
    if submodule == "local" or submodule == "git":
        # We always copy local files (we could hash the file to see if there was a change, and give them the name of
        # that hash, to only copy when there is a modification but it seems overkill for now).
        # The only reason we do the copy is to avoid putting too many folders in sys.path.
        shutil.copyfile(resolved_module_file, submodule_path / module_file)
        for module_needed in modules_needed:
            if len(module_needed.split(".")) == 2:
                module_needed = "/".join(module_needed.split("."))
                module_folder = module_needed.split("/")[0]
                if not os.path.exists(submodule_path / module_folder):
                    os.makedirs(submodule_path / module_folder)
            module_needed = f"{module_needed}.py"
            shutil.copyfile(os.path.join(pretrained_model_name_or_path, module_needed), submodule_path / module_needed)

In my case, my custom pipeline has a relative import like:

from .custom_modules.transformers.transformer_sd3 import funcs_i_need

It is added to modules_needed as 'custom_modules.transformers.transformer_sd3'. In the loop for module_needed in modules_needed, it skips the branch guarded by if len(module_needed.split(".")) == 2 because the name has 3 components, so the dots are never converted to path separators and no intermediate folders are created. shutil.copyfile then tries to copy '****/custom_pipelines/custom_modules.transformers.transformer_sd3.py', which is NOT a valid path, and raises the error!
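The failure described above can be reproduced without touching the filesystem. The sketch below is an approximation, not the library's code: find_relative_imports is a hypothetical stand-in for the relative-import scan done by check_imports (its regex is an assumption), legacy_copy_target mirrors only the path logic of the loop quoted earlier, and custom_modules.vae_a is a made-up two-component name used for comparison.

```python
import re


def find_relative_imports(source: str) -> list:
    # Hypothetical stand-in for the relative-import scan; the regex is an assumption.
    # It captures the dotted name that follows "from ." on each line.
    pattern = r"^\s*from\s+\.(\S+)\s+import"
    return re.findall(pattern, source, flags=re.MULTILINE)


def legacy_copy_target(module_needed: str) -> str:
    # Mirrors the quoted loop: dots become "/" only when there are exactly two components.
    if len(module_needed.split(".")) == 2:
        module_needed = "/".join(module_needed.split("."))
    return f"{module_needed}.py"


pipeline_source = "from .custom_modules.transformers.transformer_sd3 import funcs_i_need\n"
modules_needed = find_relative_imports(pipeline_source)

shallow_target = legacy_copy_target("custom_modules.vae_a")  # hypothetical 2-component name
deep_target = legacy_copy_target(modules_needed[0])

print(modules_needed)  # ['custom_modules.transformers.transformer_sd3']
print(shallow_target)  # custom_modules/vae_a.py
print(deep_target)     # custom_modules.transformers.transformer_sd3.py
```

The last value is the dotted file name that appears in the FileNotFoundError above: the two-component case gets a real relative path, while the three-component case keeps its dots.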

👀 Might be a solution

Maybe a minimal revision here would support a more flexible directory structure:

    if submodule == "local" or submodule == "git":
        # We always copy local files (we could hash the file to see if there was a change, and give them the name of
        # that hash, to only copy when there is a modification but it seems overkill for now).
        # The only reason we do the copy is to avoid putting too many folders in sys.path.
        shutil.copyfile(resolved_module_file, submodule_path / module_file)
        for module_needed in modules_needed:
-            if len(module_needed.split(".")) == 2:
+            if len(module_needed.split(".")) >= 2:
                module_needed = "/".join(module_needed.split("."))
-                module_folder = module_needed.split("/")[0]
+                module_folder = module_needed.rsplit("/", 1)[0]
                if not os.path.exists(submodule_path / module_folder):
                    os.makedirs(submodule_path / module_folder)
            module_needed = f"{module_needed}.py"
            shutil.copyfile(os.path.join(pretrained_model_name_or_path, module_needed), submodule_path / module_needed)
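To check the idea in isolation, here is a minimal, self-contained sketch of the same copy step written with pathlib. It is not the diffusers implementation: copy_needed_modules is a hypothetical helper, and the directories are temporary stand-ins for the local pipeline folder and the cache subfolder. It handles any nesting depth by creating all parent folders of the destination before copying.

```python
import shutil
import tempfile
from pathlib import Path


def copy_needed_modules(source_root, target_root, modules_needed):
    # Hypothetical helper: copy each dotted relative module, preserving its nesting.
    copied = []
    for module_needed in modules_needed:
        # "a.b.c" -> a/b/c.py ; "helpers" -> helpers.py
        relative_file = Path(*module_needed.split(".")).with_suffix(".py")
        destination = Path(target_root) / relative_file
        # Create every intermediate folder, not just the first one.
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(Path(source_root) / relative_file, destination)
        copied.append(relative_file.as_posix())
    return copied


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the local custom pipeline folder.
    source = Path(tmp) / "custom_pipelines"
    nested = source / "custom_modules" / "transformers"
    nested.mkdir(parents=True)
    (nested / "transformer_sd3.py").write_text("funcs_i_need = None\n")
    (source / "helpers.py").write_text("")

    # Stand-in for the cache subfolder that receives the copies.
    cache = Path(tmp) / "cache"
    cache.mkdir()

    copied = copy_needed_modules(
        source, cache, ["custom_modules.transformers.transformer_sd3", "helpers"]
    )
    nested_copy_exists = (
        cache / "custom_modules" / "transformers" / "transformer_sd3.py"
    ).is_file()

print(copied)
print(nested_copy_exists)
```

Using mkdir(parents=True, exist_ok=True) on the destination's parent plays the role of the rsplit("/", 1)[0] plus os.makedirs pair in the diff, and it also covers the single-component case without a separate branch.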

WDYT @yiyixuxu @sayakpaul @DN6, happy to hear your suggestions ☺️

sayakpaul commented 3 days ago

This could be nice indeed. Cc: @a-r-r-o-w as well.