Very Slow first inference with diffusers 0.27.X

nesscube commented 4 months ago

Describe the bug

Hello diffusers team! I have been facing an annoying issue since I upgraded diffusers to 0.27.X. The first call (and only the first) of pipeline(...) now takes a long time before inference starts (about a minute). Moreover, the call to compel(prompts) takes 30 seconds, versus being instant in 0.26.X.

This slowdown seems to happen only:

on the 0.27.X versions of diffusers
for XL models
when I load the pipeline with from_single_file from a safetensors file
I run my inference in a Docker container. My Dockerfile starts with: FROM python:3.10.6-slim-buster

Unfortunately, I need all of these for my project.

Thanks a lot for your help!

Reproduction

import torch
from compel import Compel
from diffusers import AutoencoderKL, StableDiffusionXLPipeline

# model_path, model_config, model_path_vae, prompts, negative_prompts, width, height,
# num_inference_steps and seed are defined elsewhere and were not part of the report.

pipeline = StableDiffusionXLPipeline.from_single_file(
    model_path,
    torch_dtype=torch.float16,
    local_files_only=True,
    use_safetensors=True,
    add_watermarker=False,
    original_config_file=model_config,
    vae=AutoencoderKL.from_pretrained(model_path_vae, torch_dtype=torch.float16),
)
pipeline.enable_model_cpu_offload()

# compel is a Compel instance; its construction was not included in the report.
prompt_embeds, pooled_prompt_embeds = compel(prompts)
negative_prompt_embeds, negative_pooled_prompt_embeds = compel(negative_prompts)

result = pipeline(
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
    width=width,
    height=height,
    num_inference_steps=num_inference_steps,
    guidance_scale=6,
    num_images_per_prompt=1,
    generator=torch.Generator(device='cuda').manual_seed(seed),
)

Logs

No response

System Info

diffusers version: 0.27.2
Platform: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.36
Python version: 3.10.14
PyTorch version (GPU?): 2.1.2+cu121 (True)
Huggingface_hub version: 0.22.2
Transformers version: 4.36.2
Accelerate version: 0.26.1
xFormers version: 0.0.23.post1
Using GPU in script?: no
Using distributed or parallel set-up in script?: no

Who can help?

@yiyixuxu @sayakpaul @DN6

sayakpaul commented 4 months ago

Could you provide a reproducible snippet without Compel that demonstrates the inference slow down?
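
A minimal sketch of how such a measurement could be structured, assuming the goal is to time the first call separately from later ones. It uses a stand-in callable so that it depends on neither Compel nor a particular pipeline; `time_calls` and `make_fake_pipeline` are hypothetical names introduced here for illustration. In a real run, the callable would wrap something like `pipeline(prompt=prompt)`.

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call fn() `repeats` times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def make_fake_pipeline(warmup_seconds: float = 0.05) -> Callable[[], None]:
    """Hypothetical stand-in: the first call sleeps to mimic a one-time cost, later calls return at once."""
    state = {"warm": False}

    def call() -> None:
        if not state["warm"]:
            time.sleep(warmup_seconds)
            state["warm"] = True

    return call


if __name__ == "__main__":
    durations = time_calls(make_fake_pipeline())
    print(f"first call: {durations[0]:.3f}s")
    print(f"later calls: {[round(d, 3) for d in durations[1:]]}")
```

If only the first duration is large, the cost is a one-time setup step rather than per-step inference latency, which narrows down where to look.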

sayakpaul commented 4 months ago

Also, FWIW, we run benchmarking tests regularly and do automated reporting: https://huggingface.co/datasets/diffusers/benchmarks/tree/main. As we can see, there are no unusual latency changes in the most commonly used pipelines.

lerignoux commented 4 months ago

Hello

@nesscube You were running this in WSL or on Windows desktop, right? I managed to reproduce it on my side, but it seems to be linked to the model loading.

Reproduction:

# In Docker Desktop
docker run -it -v <windows_folder_path_with_model>/:/models/ python:3.10-slim bash
cd /models
pip install diffusers==0.27.2 torch transformers accelerate

Then run:


from datetime import datetime
import torch
from diffusers import StableDiffusionXLPipeline

model_path = "albedobond/albedobase-xl-v2.1.safetensors"
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
print(f"Loading pipeline: {datetime.utcnow()}")
pipeline = StableDiffusionXLPipeline.from_single_file(model_path, torch_dtype=torch.float16, local_files_only=True)
print(f"Pipeline Loaded: {datetime.utcnow()}")
pipeline.enable_model_cpu_offload()

print(f"Generating: {datetime.utcnow()}")
image = pipeline(prompt=prompt).images[0]
print(f"Generated: {datetime.utcnow()}")

If I run the same thing with diffusers==0.26.3, there is no problem, even in WSL.
If I run the same thing but make sure the folder containing the models is in WSL and not on Windows, the problem disappears.
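
Given that observation, one possible workaround is to copy the checkpoint from the Windows-mounted folder onto the container's own filesystem before loading it. The sketch below is an illustration under assumptions, not a confirmed fix: `stage_locally` is a hypothetical helper, and the default cache directory is an arbitrary choice.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def stage_locally(source: str, cache_dir: Optional[str] = None) -> Path:
    """Copy `source` into a directory on the local filesystem and return the new path.

    The copy is skipped when a file with the same name and size is already there.
    """
    src = Path(source)
    target_dir = Path(cache_dir) if cache_dir else Path(tempfile.gettempdir()) / "model_cache"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / src.name
    if not (target.exists() and target.stat().st_size == src.stat().st_size):
        shutil.copy2(src, target)
    return target


# Usage with the names from the reproduction script (model_path, torch, StableDiffusionXLPipeline):
# local_path = stage_locally(model_path)
# pipeline = StableDiffusionXLPipeline.from_single_file(
#     str(local_path), torch_dtype=torch.float16, local_files_only=True
# )
```

The copy itself still pays the cost of reading from the mount once, but it moves that cost out of the first inference call and makes it easy to measure.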

When mounting the model from a Windows folder, I notice that the from_single_file method is much faster: it returns nearly immediately. But then generation takes ages. I guess the model is just not in RAM, so it runs from disk.
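
To check that guess, a rough diagnostic is to measure how fast the checkpoint can be read sequentially from each location (the Windows mount versus a folder inside WSL). This is only a sketch: `read_throughput_mib_s` is a hypothetical helper, and OS page caching can inflate the number on a second run over the same file.

```python
import time


def read_throughput_mib_s(path: str, chunk_size: int = 16 * 1024 * 1024) -> float:
    """Read the whole file in chunks and return the observed throughput in MiB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return (total_bytes / (1024 * 1024)) / max(elapsed, 1e-9)


# Usage with the name from the reproduction script:
# print(f"{read_throughput_mib_s(model_path):.1f} MiB/s")
```

A large gap between the two locations would be consistent with the weights being read lazily from the slow mount during the first generation, though it would not prove it.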

@sayakpaul Do you know if there was any change to the model loading process in 0.27?

yiyixuxu commented 4 months ago

We had this PR https://github.com/huggingface/diffusers/pull/6994 - is this related?

lerignoux commented 4 months ago

We had this PR #6994 - is this related?

Yes, nice one. A bisect confirmed it: the issue was introduced by this commit.

I tried to have a look today, but I will need more time to dig into the actual cause. Do you know if anyone familiar with it could help?

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
