argmaxinc / DiffusionKit

On-device Inference of Diffusion Models for Apple Silicon
MIT License

Publish DiffusionKit format checkpoints for FLUX #19

Closed by atiorh 4 weeks ago

atiorh commented 1 month ago

DiffusionKit currently patches the original FLUX checkpoint each time the weights are loaded. We should publish pre-patched checkpoints in DiffusionKit format so that no reformatting is needed at runtime.
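For illustration only, here is a minimal sketch of what a load-time patch of this kind might look like: a pass over the checkpoint's parameter names that rewrites them into the layout the model expects. The function name `patch_state_dict`, the prefix rule, and the parameter names are all hypothetical and are not taken from DiffusionKit's code. A pre-patched checkpoint would simply store the output of such a step so it does not run on every load.

```python
def patch_state_dict(state, rename_rules):
    """Return a new dict with key prefixes rewritten (hypothetical patch step)."""
    patched = {}
    for key, value in state.items():
        new_key = key
        for old_prefix, new_prefix in rename_rules:
            if new_key.startswith(old_prefix):
                # Swap the first matching prefix, then stop checking rules.
                new_key = new_prefix + new_key[len(old_prefix):]
                break
        patched[new_key] = value
    return patched


# Hypothetical parameter names; plain lists stand in for weight tensors.
original = {
    "model.diffusion_model.block.weight": [1.0, 2.0],
    "model.diffusion_model.block.bias": [0.0],
}
rules = [("model.diffusion_model.", "")]
patched = patch_state_dict(original, rules)
```

Because a step like this only touches names and not tensor data, it would be expected to be cheap relative to reading the weights from disk.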

atiorh commented 1 month ago

@arda-argmax Feel free to close this issue by posting your latency stats.

arda-argmax commented 4 weeks ago
INFO:diffusionkit.mlx.model_io:Time to create model: 0.008893251419067383 s
INFO:diffusionkit.mlx.model_io:Time to load weights: 0.3058888912200928 s
INFO:diffusionkit.mlx.model_io:Time to adjust weights: 0.007555961608886719 s

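Patching the checkpoint ("Time to adjust weights") takes about 8 ms, roughly 2% of the ~0.32 s total and small next to the ~306 ms spent loading the weights, so it does not add significant latency.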
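A rough sketch of how per-stage timings like the ones in the log above could be collected, assuming nothing about how DiffusionKit measures them internally. The `timed` helper, the logger name, and the stand-in workloads are made up for this example; only the "Time to ...: ... s" message shape mirrors the log.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("timing_sketch")  # hypothetical logger name

timings = {}


@contextmanager
def timed(label):
    """Record and log the wall-clock duration of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start
        logger.info("Time to %s: %s s", label, timings[label])


# Stand-in workloads; a real run would load and patch a checkpoint here.
with timed("load weights"):
    weights = {f"layer.{i}": [0.0] * 1000 for i in range(100)}

with timed("adjust weights"):
    weights = {k.replace("layer.", "blocks."): v for k, v in weights.items()}

# Fraction of the measured time spent in the patch step.
share = timings["adjust weights"] / (
    timings["load weights"] + timings["adjust weights"]
)
```

Comparing the patch step's share of the total, rather than its absolute time alone, makes it easier to judge whether publishing pre-patched checkpoints is worth the effort.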