huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

How can I make diffuser pipeline to use .safetensors file for SDXL? #4029

Closed FurkanGozukara closed 1 year ago

FurkanGozukara commented 1 year ago

Cloning the entire repo takes ~100 GB.

How can I make the code below use a .safetensors file instead of the full diffusers repo?

Let's say I have downloaded my safetensors file to path.safetensors.

How do I provide it?

The code below works, but it clones 100 GB instead of just a single 14 GB safetensors file. That is a waste of bandwidth.

Also, how can I add a LoRA checkpoint to this pipeline? Specifically, a LoRA checkpoint made with the Kohya script.

import gradio as gr

from diffusers import DiffusionPipeline
import torch

import base64
from io import BytesIO
import os
import gc
from datetime import datetime

from share_btn import community_icon_html, loading_icon_html, share_js

# SDXL code: https://github.com/huggingface/diffusers/pull/3859

model_dir = '/workspace'
access_token = os.getenv("ACCESS_TOKEN")

if model_dir:
    # Use local model
    model_key_base = os.path.join(model_dir, "stable-diffusion-xl-base-0.9")
    model_key_refiner = os.path.join(model_dir, "stable-diffusion-xl-refiner-0.9")
else:
    model_key_base = "stabilityai/stable-diffusion-xl-base-0.9"
    model_key_refiner = "stabilityai/stable-diffusion-xl-refiner-0.9"

# Use refiner (enabled by default)
enable_refiner = os.getenv("ENABLE_REFINER", "true").lower() == "true"
# Output images before the refiner and after the refiner
output_images_before_refiner = True

# Create public link
share = os.getenv("SHARE", "false").lower() == "true"

print("Loading model", model_key_base)
pipe = DiffusionPipeline.from_pretrained(model_key_base, torch_dtype=torch.float16, use_auth_token=access_token)

#pipe.enable_model_cpu_offload()
pipe.to("cuda")

# if using torch < 2.0
pipe.enable_xformers_memory_efficient_attention()

# pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

if enable_refiner:
    print("Loading model", model_key_refiner)
    pipe_refiner = DiffusionPipeline.from_pretrained(model_key_refiner, torch_dtype=torch.float16, use_auth_token=access_token)
    #pipe_refiner.enable_model_cpu_offload()
    pipe_refiner.to("cuda")

    # if using torch < 2.0
    pipe_refiner.enable_xformers_memory_efficient_attention()

    # pipe_refiner.unet = torch.compile(pipe_refiner.unet, mode="reduce-overhead", fullgraph=True)

# NOTE: we do not have word list filtering in this gradio demo

is_gpu_busy = False

# Generate num_images batches of `samples` images each, optionally passing them through the refiner
def infer(prompt, negative, scale, samples=4, steps=50, refiner_strength=0.3, num_images=1):
    prompt, negative = [prompt] * samples, [negative] * samples
    images_b64_list = []

    for i in range(0, num_images):
        images = pipe(prompt=prompt, negative_prompt=negative, guidance_scale=scale, num_inference_steps=steps).images
        # Create the outputs folder if it doesn't exist
        os.makedirs(r"stable-diffusion-xl-demo/outputs", exist_ok=True)
        gc.collect()
        torch.cuda.empty_cache()

        if enable_refiner:
            if output_images_before_refiner:
                for image in images:
                    buffered = BytesIO()
                    image.save(buffered, format="JPEG")
                    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")

                    image_b64 = (f"data:image/jpeg;base64,{img_str}")
                    images_b64_list.append(image_b64)

            images = pipe_refiner(prompt=prompt, negative_prompt=negative, image=images, num_inference_steps=steps, strength=refiner_strength).images

            gc.collect()
            torch.cuda.empty_cache()

        for i, image in enumerate(images):
            buffered = BytesIO()
            image.save(buffered, format="JPEG")
            img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
            timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
            image_b64 = (f"data:image/jpeg;base64,{img_str}")
            images_b64_list.append(image_b64)
            # Save the image as PNG with unique timestamp
            filename = f"stable-diffusion-xl-demo/outputs/generated_image_{timestamp}_{i}.png"
            image.save(filename, format="PNG")

    return images_b64_list
sayakpaul commented 1 year ago

You can do:

pipeline = DiffusionPipeline.from_pretrained(ckpt_id, use_safetensors=True, variant="fp16")

to directly load the safetensors and fp16 variants of the checkpoints.
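
For SDXL, a fuller call might look like the sketch below; the checkpoint id, dtype, and device placement are assumptions for illustration, not part of the answer above.

import torch
from diffusers import DiffusionPipeline

# Hypothetical checkpoint id; substitute the SDXL repo you actually use
ckpt_id = "stabilityai/stable-diffusion-xl-base-1.0"

# use_safetensors=True restricts the download to .safetensors weights,
# and variant="fp16" picks the half-precision files, roughly halving the download
pipe = DiffusionPipeline.from_pretrained(
    ckpt_id,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipe.to("cuda")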

FurkanGozukara commented 1 year ago

You can do:

pipeline = DiffusionPipeline.from_pretrained(ckpt_id, use_safetensors=True, variant="fp16")

to directly load the safetensors and fp16 variants of the checkpoints.

Thank you. How can I add a Kohya-script-trained LoRA safetensors file to this pipeline?

sayakpaul commented 1 year ago

Now, that deviates a bit from the original issue you posted. But we have a document here: https://huggingface.co/docs/diffusers/main/en/training/lora#supporting-a1111-themed-lora-checkpoints-from-diffusers.
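
The linked page shows the general pattern; as a minimal sketch, recent diffusers releases let you load a Kohya-style LoRA into the pipeline with load_lora_weights (the directory, file name, and scale below are made up for illustration):

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
).to("cuda")

# Hypothetical directory and file name for a Kohya-trained LoRA checkpoint
pipe.load_lora_weights("/workspace/loras", weight_name="my_kohya_lora.safetensors")

# cross_attention_kwargs["scale"] adjusts how strongly the LoRA is applied
image = pipe("a photo of a castle", cross_attention_kwargs={"scale": 0.8}).images[0]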

We have ongoing threads on Kohya:

So, to centralize the discussions there, I am going to close this thread, assuming https://github.com/huggingface/diffusers/issues/4029#issuecomment-1630618340 solved your initial query. If not, please feel free to reopen.

JosephCatrambone commented 1 year ago

Please forgive this comment on a closed ticket, but this may be helpful to others who stumbled upon this issue:

For others who got here via Google and are trying to load a safetensors file (like one downloaded from a website that aggregates models), please try this command: pipe = StableDiffusionXLPipeline.from_single_file("/home/you/path/etc/my_sdxl_model_from_civitai.safetensors").

If one tries to load a standalone safetensors file with DiffusionPipeline.from_pretrained, it will show HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/home/foo/bar.safetensors'. Use `repo_type` argument if needed..
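
A minimal sketch of the full call, assuming an fp16 GPU setup (the dtype, device, and prompt are assumptions):

import torch
from diffusers import StableDiffusionXLPipeline

# Hypothetical local path to a single-file SDXL checkpoint
pipe = StableDiffusionXLPipeline.from_single_file(
    "/home/you/path/etc/my_sdxl_model_from_civitai.safetensors",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

image = pipe("a scenic mountain lake at sunrise").images[0]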

zengjie617789 commented 10 months ago

@JosephCatrambone What if I use StableDiffusionXLPipeline.from_single_file("/home/you/path/etc/my_sdxl_model_from_civitai.safetensors") and it still downloads the text_encoder? Furthermore, I found that from_single_file hard-codes the path of the text_encoder instead of loading it locally.

JosephCatrambone commented 9 months ago

@zengjie617789

@JosephCatrambone What if I use StableDiffusionXLPipeline.from_single_file("/home/you/path/etc/my_sdxl_model_from_civitai.safetensors") and it still downloads the text_encoder?

If the safetensors file requires a text_encoder, it will still be downloaded. There is a flag to disable this if your system cannot (or should not) connect to the internet while deployed: https://huggingface.co/docs/diffusers/v0.24.0/en/api/pipelines/overview#diffusers.DiffusionPipeline.from_pretrained.local_files_only
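
A minimal sketch, assuming everything the pipeline needs is already on disk or in the local Hugging Face cache:

from diffusers import StableDiffusionXLPipeline

# With local_files_only=True nothing is downloaded; missing files raise an error instead
pipe = StableDiffusionXLPipeline.from_single_file(
    "/home/you/path/etc/my_sdxl_model_from_civitai.safetensors",
    local_files_only=True,
)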

Furthermore, I found that from_single_file hard-codes the path of the text_encoder instead of loading it locally.

😖 If I am understanding correctly, from_single_file is hard-coding the text_encoder path? That is not good. You may be able to load the text_encoder separately and pass it in via https://huggingface.co/docs/diffusers/v0.24.0/en/api/loaders/single_file#diffusers.loaders.FromSingleFileMixin.from_single_file.text_encoder:

# Load your text encoder however you like, then pass it to the pipeline
my_text_encoder = load_my_text_encoder_here(...)
diffuser = StableDiffusionXLPipeline.from_single_file(
    "/home/you/path/etc/my_sdxl_model.safetensors",
    use_safetensors=True,
    text_encoder=my_text_encoder,
)

In Chinese:

If local_files_only == False, then from_single_file will download the text_encoder (if the text_encoder is not already present).

If the model hard-codes the text_encoder, you can try StableDiffusionXLPipeline.from_single_file(..., text_encoder=...).

Sorry, my Chinese is not good. :')