Stable Diffusion 3.5 Medium was released on Oct 29 with a modified architecture named MMDiT-X. This PR adds support for Stable Diffusion 3.5 Medium and the MMDiT-X model.
A different block-joining scheme (between x and context) is also present here.
Note: A change was made in sd3.5 after the release of Stable Diffusion 3.5 Medium on Oct 29 that fixes some bugs in the original reference design.
Implementation-wise, trait polymorphism is kept between the old and new JointBlock, but the individual DiTBlock is re-implemented to avoid coupling. An ad-hoc adaptation of the original DiTBlock was attempted and dropped, as it seemed less sensible in terms of software engineering.
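As a rough illustration of the trait polymorphism described above, the following is a minimal sketch rather than the code in this PR. The names JointBlockLike, PlainJointBlock, XJointBlock, and run_blocks are hypothetical, and Vec<f32> stands in for a real tensor type. It only shows how two independently implemented block types can sit behind one trait.

```rust
// Hypothetical stand-in for a tensor; real code would use the framework's tensor type.
type Tensor = Vec<f32>;

/// Shared interface for both joint-block variants (illustrative name).
trait JointBlockLike {
    /// Takes the context and x streams and returns the updated pair.
    fn forward(&self, context: &Tensor, x: &Tensor) -> (Tensor, Tensor);
}

/// Stand-in for the original joint block.
struct PlainJointBlock;

/// Stand-in for the MMDiT-X joint block, implemented separately to avoid coupling.
struct XJointBlock {
    extra_scale: f32,
}

impl JointBlockLike for PlainJointBlock {
    fn forward(&self, context: &Tensor, x: &Tensor) -> (Tensor, Tensor) {
        // Identity placeholder; the real block would run joint attention here.
        (context.clone(), x.clone())
    }
}

impl JointBlockLike for XJointBlock {
    fn forward(&self, context: &Tensor, x: &Tensor) -> (Tensor, Tensor) {
        // Placeholder for the extra x-only path; this is not real attention math.
        let x_out: Tensor = x.iter().map(|v| v + self.extra_scale * v).collect();
        (context.clone(), x_out)
    }
}

/// Runs a stack of blocks without knowing which variant each one is.
fn run_blocks(
    blocks: &[Box<dyn JointBlockLike>],
    context: Tensor,
    x: Tensor,
) -> (Tensor, Tensor) {
    blocks
        .iter()
        .fold((context, x), |(c, x), block| block.forward(&c, &x))
}
```

The caller only holds a list of boxed trait objects, so the old and new block types can evolve separately as long as both keep the same forward signature.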
In SD3.5, the X-block in the first 12 of the 24 JointBlock layers gains an extra attention side track, attn2 (namely "Self Attention"). None of the context-blocks have this extra attention, so MMDiTXJointBlock is set to use this specification without further generalization.
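The layer layout described above can be sketched as follows. This is an illustrative assumption about how such a specification could be expressed, not the code in this PR: LayerSpec and mmditx_layout are hypothetical names, and the values 24 and 12 are taken from the description above.

```rust
/// Per-layer description of which streams carry the extra self-attention (attn2).
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LayerSpec {
    x_block_has_attn2: bool,
    context_block_has_attn2: bool,
}

/// Builds the layout for `depth` joint blocks, where only the x-block of the
/// first `num_attn2_layers` layers has the attn2 side track.
fn mmditx_layout(depth: usize, num_attn2_layers: usize) -> Vec<LayerSpec> {
    (0..depth)
        .map(|i| LayerSpec {
            x_block_has_attn2: i < num_attn2_layers,
            // Context-blocks never get the extra attention in this specification.
            context_block_has_attn2: false,
        })
        .collect()
}
```

Calling mmditx_layout(24, 12) yields 24 entries, of which the first 12 have x_block_has_attn2 set and none have context_block_has_attn2 set.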
The change is based on the reference design sd3.5/mmdit-x.py, compared against sd3-ref/mmdit.py.
Sample image generated with Stable Diffusion 3.5 Medium: