`lora_scale` has no effect when loading with Flux

cshowley commented 2 hours ago

Describe the bug

According to the guide on loading LoRAs for inference, the argument cross_attention_kwargs={"scale": 0.5} can be passed to a pipeline() call to vary the impact of a LoRA on image generation. Since the FluxPipeline class doesn't support this argument, I followed the guide here to embed the text prompt with a LoRA scaling parameter. However, the image remained unchanged with a fixed seed and prompt while varying lora_scale. I checked the embedding values for different values of lora_scale and saw that they did not change either. Does Flux in diffusers not support LoRA scaling, or am I missing something?
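For context, a LoRA scale is usually a multiplier on the low-rank update that the adapter adds to a base weight. The following is a minimal pure-Python sketch of that idea, assuming the common formulation W + scale * (B @ A). The helper names and the 2x2 numbers are made up for illustration and are not part of the diffusers API.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(base, lora_b, lora_a, scale):
    """Return base + scale * (lora_b @ lora_a). Hypothetical helper for illustration."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Made-up rank-1 adapter on a 2x2 identity weight.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape (2, 1)
lora_a = [[0.5, 0.5]]    # shape (1, 2)

full = apply_lora(base, lora_b, lora_a, 1.0)
half = apply_lora(base, lora_b, lora_a, 0.5)
off = apply_lora(base, lora_b, lora_a, 0.0)
print(full, half, off)
```

If a scale is actually reaching the adapted layers, the outputs for 1.0, 0.5 and 0.0 should differ, with 0.0 reproducing the base weights.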

Reproduction

from diffusers import FluxPipeline
import torch
from PIL import Image

model_path = "black-forest-labs/FLUX.1-dev"
lora_path = "CiroN2022/toy-face"
weight_name = "toy_face_sdxl.safetensors"
device = "cuda"
seed = torch.manual_seed(0)  # returns the seeded default torch.Generator

pipeline = FluxPipeline.from_pretrained(
    model_path,  # passed positionally; from_pretrained has no model_path keyword
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
).to(device)

pipeline.load_lora_weights(lora_path, weight_name=weight_name)

prompt = "toy_face of a hacker with a hoodie"
lora_scale = 0.5
# encode_prompt returns (prompt_embeds, pooled_prompt_embeds, text_ids)
prompt_embeds, pooled_prompt_embeds, _ = pipeline.encode_prompt(
    prompt=prompt,
    prompt_2=None,
    lora_scale=lora_scale,
)

image = pipeline(
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    num_inference_steps=10,
    guidance_scale=5,
    generator=seed,
).images[0]

image.show()

Logs

No response

System Info

🤗 Diffusers version: 0.30.3
Platform: Linux-6.5.0-26-generic-x86_64-with-glibc2.31
Running on Google Colab?: No
Python version: 3.10.15
PyTorch version (GPU?): 2.3.1+cu121 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Huggingface_hub version: 0.23.4
Transformers version: 4.44.0
Accelerate version: 0.33.0
PEFT version: 0.12.0
Bitsandbytes version: not installed
Safetensors version: 0.4.3
xFormers version: not installed
Accelerator: NVIDIA A100 80GB PCIe, 81920 MiB
Using GPU in script?: yes
Using distributed or parallel set-up in script?: no

Who can help?

No response

asomoza commented 2 hours ago

Hi, I never use that method, can you test with this?

pipeline.load_lora_weights(lora_path, weight_name=weight_name, adapter_name="toy")
pipeline.set_adapters("toy", 0.5)

And yeah, the Flux pipeline doesn't have cross_attention_kwargs. You're passing lora_scale directly when encoding the prompt, so it only affects the text encoders. If the LoRA didn't train the text encoders (most don't), you won't see any difference.
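One way to check whether a LoRA touches the text encoders at all is to look at the names of the tensors it contains. Below is a minimal sketch, assuming the keys are prefixed with the component they target (for example transformer. or text_encoder.). Key naming varies between LoRA formats, and the key list here is hypothetical, written by hand rather than read from the file in this issue.

```python
from collections import Counter


def count_lora_targets(keys):
    """Count LoRA tensors per component, using the text before the first dot."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Hypothetical key names for illustration only; in practice these would be
# the keys of the state dict loaded from the LoRA's safetensors file.
example_keys = [
    "transformer.single_transformer_blocks.0.attn.to_q.lora_A.weight",
    "transformer.single_transformer_blocks.0.attn.to_q.lora_B.weight",
    "transformer.transformer_blocks.0.attn.to_k.lora_A.weight",
]

targets = count_lora_targets(example_keys)
has_text_encoder_lora = any(name.startswith("text_encoder") for name in targets)
print(dict(targets), has_text_encoder_lora)
```

If no key targets a text encoder, scaling the LoRA during prompt encoding has nothing to act on, which would match the unchanged embeddings reported above.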

cshowley commented 1 hour ago

Unfortunately, your suggestion of calling set_adapters("toy", 0.5) does not change the output either.

In this guide I see the following code block:

pipe = ... # create pipeline
pipe.load_lora_weights(..., adapter_name="my_adapter")
scales = {
    "text_encoder": 0.5,
    "text_encoder_2": 0.5,  # only usable if pipe has a 2nd text encoder
    "unet": {
        "down": 0.9,  # all transformers in the down-part will use scale 0.9
        # "mid"  # in this example "mid" is not given, therefore all transformers in the mid part will use the default scale 1.0
        "up": {
            "block_0": 0.6,  # all 3 transformers in the 0th block in the up-part will use scale 0.6
            "block_1": [0.4, 0.8, 1.0],  # the 3 transformers in the 1st block in the up-part will use scales 0.4, 0.8 and 1.0 respectively
        }
    }
}
pipe.set_adapters("my_adapter", scales)

which says to pass a dictionary in the .set_adapters() call. If I pass 0.5 like you said, does that apply that weighting to all elements of the LoRA?
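The thread does not settle this question. As an illustration only, here is a minimal sketch of what uniform application would mean: a scalar is expanded to the same value for every component, while a dictionary falls back to 1.0 for any component it omits, as the comments in the quoted guide describe. The function and the component names are hypothetical, and this is not the diffusers implementation.

```python
def expand_scale(scale, components=("text_encoder", "text_encoder_2", "unet")):
    """Expand a scalar or partial dict into one scale per component.

    Hypothetical helper: assumes a scalar applies uniformly and that
    components missing from a dict use the default scale 1.0.
    """
    if isinstance(scale, dict):
        return {name: scale.get(name, 1.0) for name in components}
    return {name: float(scale) for name in components}


uniform = expand_scale(0.5)
partial = expand_scale({"unet": 0.9})
print(uniform)
print(partial)
```

Under that assumption, passing 0.5 would be equivalent to a dictionary with 0.5 for every component.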
