huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0

How to load multiple safetensors with json format #2413

Open oovm opened 1 month ago

oovm commented 1 month ago

For such a task:

https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main/transformer

how should safetensors be loaded?

LaurentMazare commented 1 month ago

All the examples in this repo use `hub_load_safetensors` to load weights from a JSON description like this one. In the case of flux, we actually use the main safetensors file at the root of the repo, which combines all of these together.
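For reference, the JSON description is an index with a `weight_map` from tensor names to shard files, and the loader's job is to collect the unique shards to fetch. A minimal, dependency-free sketch of that step (the index excerpt and helper name are hypothetical; real code would parse the JSON with serde_json rather than scanning quoted strings):

```rust
use std::collections::BTreeSet;

/// Collect the unique ".safetensors" shard filenames referenced by an
/// index JSON's `weight_map` (tensor name -> shard file). Hand-rolled
/// scan over quoted strings to keep the sketch std-only.
fn shard_files(index_json: &str) -> BTreeSet<String> {
    index_json
        .split('"')
        .filter(|s| s.ends_with(".safetensors"))
        .map(|s| s.to_string())
        .collect()
}

fn main() {
    // Hypothetical excerpt of a *.safetensors.index.json file.
    let index = r#"{
      "weight_map": {
        "transformer_blocks.0.attn.to_q.weight": "model-00001-of-00003.safetensors",
        "transformer_blocks.0.attn.to_k.weight": "model-00001-of-00003.safetensors",
        "transformer_blocks.18.ff.net.0.proj.weight": "model-00002-of-00003.safetensors"
      }
    }"#;
    let shards = shard_files(index);
    assert_eq!(shards.len(), 2); // two distinct shard files referenced
    for s in &shards {
        println!("{s}");
    }
}
```

In candle, the downloaded shard paths are then typically handed to `VarBuilder::from_mmaped_safetensors`, which memory-maps the files instead of reading them into RAM up front.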

oovm commented 1 month ago

Thanks, this works great with mmap.


By the way, is there a plan to load and save a single flux model file? The model I downloaded from civit has this structure:

[
    "model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale",
    "model.diffusion_model.single_blocks.9.norm.query_norm.scale",
    ...,
    "model.diffusion_model.final_layer.adaLN_modulation.1.bias",
    "model.diffusion_model.guidance_in.in_layer.bias",
    "model.diffusion_model.img_in.bias",
    ...
    "text_encoders.clip_l.logit_scale",
    "text_encoders.t5xxl.logit_scale",
    ...
    "vae.encoder.conv_out.weight",
]

And the python version also supports this:

https://github.com/huggingface/diffusers/pull/9083
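Splitting such a combined checkpoint back into components is mostly a matter of stripping key prefixes. A std-only sketch, with the prefixes taken from the key list above (the function name and the component labels are hypothetical):

```rust
/// Route a combined-checkpoint tensor name to a (component, local key)
/// pair by stripping the known single-file prefixes. Returns None for
/// keys belonging to components this sketch does not recognize.
fn route_key(key: &str) -> Option<(&'static str, &str)> {
    const PREFIXES: &[(&str, &str)] = &[
        ("model.diffusion_model.", "transformer"),
        ("text_encoders.clip_l.", "clip_l"),
        ("text_encoders.t5xxl.", "t5xxl"),
        ("vae.", "vae"),
    ];
    PREFIXES
        .iter()
        .find_map(|(p, c)| key.strip_prefix(p).map(|rest| (*c, rest)))
}

fn main() {
    assert_eq!(
        route_key("model.diffusion_model.img_in.bias"),
        Some(("transformer", "img_in.bias"))
    );
    assert_eq!(
        route_key("vae.encoder.conv_out.weight"),
        Some(("vae", "encoder.conv_out.weight"))
    );
    assert_eq!(route_key("unknown.scale"), None);
}
```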

LaurentMazare commented 1 month ago

This shouldn't be too hard to support as the weight structure seems mostly identical. My guess is that it would just require tweaking the example code, without much (or any) change on the actual model side. Could you provide some pointers to models that use this layout?

oovm commented 1 month ago

flux-clip.json

scheduler/                # not mapping
text_encoder/             # -> text_encoders.clip_l.transformer.{KEY}
?                         # -> text_encoders.clip_l.logit_scale
text_encoder_2/           # -> text_encoders.t5xxl.transformer.{KEY}
?                         # -> text_encoders.t5xxl.logit_scale
tokenizer/                # no weights
tokenizer_2/              # no weights
transformer/              # not mapping?
vae/                      # not mapping?
ae.safetensors            # -> vae.{KEY}
flux1-dev.safetensors     # -> model.diffusion_model.{KEY}

`['$.logit_scale', '$.transformer.text_projection.weight']` seem to be used only by ComfyUI.

ComfyUI seems to use single files instead of vae/ and transformer/.
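The mapping above could be encoded as a simple renaming step. A sketch assuming the table as given (the component names and helper are hypothetical; entries marked "not mapping" or without weights are left out):

```rust
/// Map a (diffusers-style component, tensor key) pair to the key name
/// used in the combined single-file checkpoint, per the table above.
fn single_file_key(component: &str, key: &str) -> Option<String> {
    match component {
        "text_encoder" => Some(format!("text_encoders.clip_l.transformer.{key}")),
        "text_encoder_2" => Some(format!("text_encoders.t5xxl.transformer.{key}")),
        "ae" => Some(format!("vae.{key}")),                      // ae.safetensors
        "flux1-dev" => Some(format!("model.diffusion_model.{key}")), // flux1-dev.safetensors
        _ => None, // scheduler/tokenizers have no weights; vae/ and transformer/ unclear
    }
}

fn main() {
    assert_eq!(
        single_file_key("text_encoder", "text_model.embeddings.token_embedding.weight")
            .as_deref(),
        Some("text_encoders.clip_l.transformer.text_model.embeddings.token_embedding.weight")
    );
    assert_eq!(single_file_key("scheduler", "anything"), None);
}
```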

LaurentMazare commented 1 month ago

Sorry, I meant: would you have a link to an actual safetensors file that has this layout, so that I can give it a try?

oovm commented 3 weeks ago

https://huggingface.co/Comfy-Org/flux1-dev/tree/main?show_file_info=flux1-dev-fp8.safetensors

This file can't actually be loaded yet because its dtype (fp8) is not supported.

I can't find an unquantized version.