SD3.5 large and large turbo share the same model architecture. The main difference from SD3 medium is that they use QK RMS normalization, which SD3 medium does not.
This PR implements QK RMS normalization in MMDiT. Additionally, the inference pipeline differs in the turbo version, which doesn't require classifier-free guidance, so I have made some changes to the APIs.
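For reviewers unfamiliar with the technique, here is a minimal NumPy sketch of QK RMS normalization inside attention. It is illustrative only and is not the code in this PR: the function names, the `eps` value, and the tensor layout `(batch, heads, tokens, head_dim)` are all assumptions.

```python
import numpy as np


def rms_norm(x, scale, eps=1e-6):
    # Divide the last axis by its root mean square, then apply a learned scale.
    rms = np.sqrt(np.mean(np.square(x), axis=-1, keepdims=True) + eps)
    return x / rms * scale


def attention_with_qk_rms_norm(q, k, v, q_scale, k_scale):
    # Hypothetical helper: q, k, v have shape (batch, heads, tokens, head_dim).
    # Queries and keys are normalized per head before the dot product.
    q = rms_norm(q, q_scale)
    k = rms_norm(k, k_scale)
    head_dim = q.shape[-1]
    logits = q @ np.swapaxes(k, -1, -2) / np.sqrt(head_dim)
    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(1, 2, 4, 8)) for _ in range(3))
out = attention_with_qk_rms_norm(q, k, v, np.ones(8), np.ones(8))
```

Values are left untouched; only the queries and keys are normalized, which bounds the magnitude of the attention logits.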
The presets have been uploaded to the kaggle/kerashub path. Let me know if any changes are needed.
@divyashreepathihalli @mattdangerw
Sample outputs (Large vs. Large Turbo) for the prompt "A cat holding a sign that says hello world".
Parameters of `generate` (ref: huggingface/diffusers):
Large: num_steps=40, guidance_scale=4.5
Large Turbo: num_steps=4, guidance_scale=None (much faster than the large version)
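To illustrate why `guidance_scale=None` makes the turbo version cheaper per step, here is a rough sketch of how classifier-free guidance can be skipped. It is not the pipeline code in this PR: `guided_prediction`, `denoise_fn`, and the toy denoiser are hypothetical names used only for this example.

```python
import numpy as np


def guided_prediction(denoise_fn, latents, cond, uncond, guidance_scale=None):
    # Without guidance (assumed turbo behavior), run one conditional pass only.
    if guidance_scale is None:
        return denoise_fn(latents, cond)
    # With guidance, run conditional and unconditional passes, then
    # extrapolate from the unconditional prediction toward the conditional one.
    pred_cond = denoise_fn(latents, cond)
    pred_uncond = denoise_fn(latents, uncond)
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)


# Toy stand-in for the diffusion model, for illustration only.
def toy_denoise(latents, embedding):
    return latents + embedding


latents = np.zeros((1, 4))
cond, uncond = np.ones((1, 4)), np.zeros((1, 4))
turbo_out = guided_prediction(toy_denoise, latents, cond, uncond, None)
large_out = guided_prediction(toy_denoise, latents, cond, uncond, 4.5)
```

In this sketch the guided path calls the model twice per step and the unguided path calls it once, on top of the difference in `num_steps`.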