chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

SDXL Swap LoRA Issue #75

Closed. SuperSecureHuman closed this issue 7 months ago.

SuperSecureHuman commented 7 months ago

Hey

I am trying to swap the LoRA of the compiled model using the sample code given in the README, and I get this error:

[screenshot of the error]

When I try to replace the weights myself, I get very bad outputs.

Here is my snippet for swapping the weights:

```python
# Snapshot the original UNet parameter tensors (the compiled graph references them),
# load the new LoRA, then copy the updated values back into those tensors in place.
state_dict = pipe.unet.state_dict()
pipe.load_lora_weights("CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors", adapter_name="toy_face")
update_state_dict(state_dict, pipe.unet.state_dict())
pipe.unet.load_state_dict(state_dict, assign=True)
```

Then infer.
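
For context, `update_state_dict` in the snippet above is the in-place copy helper from the README's dynamic LoRA switching example, roughly:

```python
# Roughly the in-place copy helper from the README's dynamic LoRA switching
# example (check the README for the exact version).
def update_state_dict(dst, src):
    for key, value in src.items():
        # Copy in place: the traced graph holds references to the original
        # tensors, so it picks up the new values.
        dst[key].copy_(value)
```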

SuperSecureHuman commented 7 months ago

Here is the expected style of the output:

[screenshot: expected style]

Sometimes the output is the same as the previous style:

[screenshot: output matching the previous style]

Sometimes it is just plain bad:

[screenshot: degraded output]

chengzeyi commented 7 months ago

@SuperSecureHuman Try fusing the LoRA before compilation, then unfusing and re-fusing the LoRA on each swap. This functionality is tricky, and I hope you understand the basics of CUDA memory management and CUDA Graph.
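
A minimal sketch of that workflow, assuming diffusers' `load_lora_weights` / `fuse_lora` / `unfuse_lora` API and stable-fast's `compile` entry point (module path and second LoRA repo below are assumptions and may differ in your setup):

```python
import torch
from diffusers import StableDiffusionXLPipeline
# Module path may differ across stable-fast versions
# (e.g. sfast.compilers.stable_diffusion_pipeline_compiler in older releases).
from sfast.compilers.diffusion_pipeline_compiler import compile, CompilationConfig

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# 1. Load and fuse the first LoRA *before* compiling.
pipe.load_lora_weights("CiroN2022/toy-face", weight_name="toy_face_sdxl.safetensors")
pipe.fuse_lora()

pipe = compile(pipe, CompilationConfig.Default())

# ... run inference with the first LoRA ...

# 2. To swap: unfuse the old LoRA, unload it, load the new one, fuse again.
pipe.unfuse_lora()
pipe.unload_lora_weights()
pipe.load_lora_weights("another/lora-repo", weight_name="another_lora.safetensors")  # hypothetical second LoRA
pipe.fuse_lora()
```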

SuperSecureHuman commented 7 months ago

Got it, trying now

chengzeyi commented 7 months ago

If you cannot get it to work, try disabling CUDA Graph and increasing the batch size to improve GPU utilization.
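
Building on the compilation sketch above, a hedged example of turning CUDA Graph off via `CompilationConfig` (field names follow the README at the time of writing and may differ between versions):

```python
config = CompilationConfig.Default()
config.enable_xformers = True
config.enable_triton = True
config.enable_cuda_graph = False  # disable CUDA Graph so in-place weight swaps take effect

compiled_pipe = compile(pipe, config)

# With CUDA Graph off, a larger batch helps keep the GPU busy,
# e.g. several images per prompt in one call:
images = compiled_pipe("a toy face", num_images_per_prompt=4).images
```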

chengzeyi commented 7 months ago

Also, a LoRA may include weights for components other than the UNet. In that case, the text encoder and VAE also need to be handled, and possibly other parts as well.
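
For reference, a minimal sketch of the SDXL pipeline components a LoRA can touch (attribute names assumed from diffusers' StableDiffusionXLPipeline; `pipe` as in the snippets above):

```python
# Components an SDXL LoRA can affect. If you manage weights manually, each of
# these needs the same snapshot / in-place copy treatment shown for the UNet.
lora_components = [
    pipe.unet,
    pipe.text_encoder,
    pipe.text_encoder_2,
    # pipe.vae,  # uncommon, but some LoRAs ship VAE weights as well
]

# Note: the pipeline-level pipe.fuse_lora() / pipe.unfuse_lora() calls cover
# the UNet and both text encoders, which is one reason the fuse/unfuse
# approach above is simpler.
```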

SuperSecureHuman commented 7 months ago

I'll try it with my use cases and get back if any help is needed with those components.

Thanks!