lllyasviel / sd-forge-layerdiffuse

[WIP] Layer Diffusion for WebUI (via Forge)

Apache License 2.0

Diffusers #53

Open pin-lpt-heatherwick opened 8 months ago

pin-lpt-heatherwick commented 8 months ago

Will there be a version for diffusers?

lachlan-nicholson commented 8 months ago

The TransparentVAEDecoder is compatible with Diffusers, though translating the LoRA model keys may take some effort.

lachlan-nicholson commented 8 months ago

@arpitsahni04 Specifically for the core LoRA model (not the others which change layer sizes):

Diffusers has some support for converting SGM/Automatic/Kohya-format LoRAs to the Diffusers format. This LoRA seemed to be in a slightly different format, which I handled with a few string replacements.

Specifically:

        k = k.replace("diffusion_model", "lora_unet")
        k = k.replace(".", "_")
        k = k.replace("_weight::lora::0", ".lora_up.weight")
        k = k.replace("_weight::lora::1", ".lora_down.weight")
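For readers who want to apply these replacements to a whole checkpoint, here is a minimal sketch that wraps the four lines above in a function and runs it over a key-to-tensor mapping. The function names `convert_key` and `convert_state_dict` are hypothetical and are not taken from the commenter's code. The sketch assumes every LoRA key ends in `.weight::lora::0` or `.weight::lora::1`, and it only renames keys; loading the result into Diffusers is not shown.

```python
def convert_key(k: str) -> str:
    """Rename one key using the string replacements quoted above."""
    k = k.replace("diffusion_model", "lora_unet")
    # Flatten the dotted module path into an underscore-separated name.
    k = k.replace(".", "_")
    # After flattening, ".weight::lora::N" has become "_weight::lora::N".
    k = k.replace("_weight::lora::0", ".lora_up.weight")
    k = k.replace("_weight::lora::1", ".lora_down.weight")
    return k


def convert_state_dict(state_dict: dict) -> dict:
    """Apply convert_key to every key; the values (tensors) are untouched."""
    return {convert_key(k): v for k, v in state_dict.items()}


if __name__ == "__main__":
    # Key name taken from a later comment in this thread.
    example = (
        "diffusion_model.middle_block.1.transformer_blocks.0"
        ".attn1.to_k.weight::lora::0"
    )
    print(convert_key(example))
```

The order of the replacements matters: the `::lora::N` suffix is rewritten last, so the dots it introduces are not flattened into underscores.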

pin-lpt-heatherwick commented 8 months ago

@lachlan-nicholson I don't think it's going to work. For example, despite the similar naming conventions, the first middle block layer has this name and shape:

        diffusion_model.middle_block.1.transformer_blocks.0.attn1.to_k.weight::lora::0 torch.Size([1280, 256])

while the expected name and shape are:

        unet.mid_block.attentions.0.transformer_blocks.0.attn1.processor.to_k_lora.down.weight torch.Size([4, 1280])

Am I mistaken, or is there some way to convert them?

lachlan-nicholson commented 8 months ago

@pin-lpt-heatherwick I can confirm it does work.

Those shapes you shared are okay: the first is an up weight and the second is a down weight, so their shapes are transposed relative to each other. LoRA down weights have shape [rank, dim], in this case [r, 1280], while up weights have shape [dim, rank]. As long as the up and down weights of a layer have the same rank, there shouldn't be any concerns.
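The rank condition can be illustrated with a small shape check. This is a sketch, not code from the thread: `lora_delta_shape` is a hypothetical helper, and the down-weight shape (256, 1280) used in the example is an assumption, since only the up-weight shape [1280, 256] is quoted above.

```python
def lora_delta_shape(up_shape, down_shape):
    """Return the shape of up @ down, or raise if the ranks differ.

    up_shape is (out_dim, rank) and down_shape is (rank, in_dim).
    """
    out_dim, up_rank = up_shape
    down_rank, in_dim = down_shape
    if up_rank != down_rank:
        raise ValueError(
            f"rank mismatch: up has rank {up_rank}, down has rank {down_rank}"
        )
    # The LoRA update up @ down has the same shape as the base weight.
    return (out_dim, in_dim)


if __name__ == "__main__":
    # Up shape from the thread; the matching down shape is assumed.
    print(lora_delta_shape((1280, 256), (256, 1280)))
```

An up weight and a down weight taken from two different LoRAs, such as a rank-256 up weight and a rank-4 down weight, would fail this check, but that does not indicate a problem as long as each file is internally consistent.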

LiuShiyu95 commented 8 months ago

@lachlan-nicholson Hello, sorry to bother you. May I ask if you know what the preprocessing of the input and output data in the TransparentVAEDecoder looks like, and whether the range of the latents and images is -1 to 1, 0 to 1, or something else? I am not good at using WebUI.

Victor-lol commented 8 months ago

Do you mind sharing your code? I'm wondering how you solved this issue. Thanks!

nighting0le01 commented 7 months ago

Hey @lachlan-nicholson, did you try applying the LoRA weights on the SD1.5 variant as well (layer_sd15_transparent_attn.safetensors)?