With very large open models like SD3 Medium and Flux.1 gaining popularity, it's becoming common to distribute the diffusion model (UNet/diffusion transformer) and the text encoders separately. Since the text encoders can often be reused across different models, this saves internet bandwidth and storage space.
I think it would be cool to support these split models here. It could also be a way to use a different quantization for each part of the model.
VAEs can already be provided separately, and this is the same kind of thing.