
# SDXL IRs and Scripts

## SDXL end-to-end benchmarking

  1. Check out and build IREE in release mode, then put its tools on your path: `export PATH=/path/to/iree/build/release/tools:$PATH`
  2. Compile the full SDXL pipeline: `./compile-txt2img.sh gfx942` (`gfx942` is the target architecture for MI300X).
  3. Run the benchmark: `./benchmark-txt2img.sh N /path/to/weights/irpa` (`N` is the GPU index). A worked example of the whole flow is sketched below.
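
Put together, a typical session looks roughly like the sketch below. The IREE checkout location (`~/iree`), build directory, GPU index, and weights path are placeholders; see IREE's build documentation for the full set of CMake options (in particular, enabling the ROCm/HIP target in your build).

```shell
# Configure and build IREE in release mode (placeholder paths; enable the
# ROCm/HIP backend per IREE's build docs for your checkout).
cmake -G Ninja -S ~/iree -B ~/iree/build/release -DCMAKE_BUILD_TYPE=Release
cmake --build ~/iree/build/release

# Make the freshly built tools (iree-compile, iree-benchmark-module, ...) visible.
export PATH=$HOME/iree/build/release/tools:$PATH

# Compile the SDXL submodels for MI300X, then benchmark on GPU 0 with your weights.
./compile-txt2img.sh gfx942
./benchmark-txt2img.sh 0 /path/to/weights/irpa
```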

## Model IRs and weights

> [!CAUTION]
> IRs in the following table might be stale. Use the ones in the `base_ir/` directory instead.

> [!NOTE]
> SDXL-turbo differs from SDXL only in its usage and training/weights. The model architecture (and therefore the weights-stripped MLIR) is the same.

| Variant | Submodel | MLIR (No Weights) (Config A) | safetensors | Splat IRPA | MLIR (No Weights) (Config B) |
| --- | --- | --- | --- | --- | --- |
| SDXL1.0 1024x1024 (f16, BS1, len64) | UNet + attn | Torch, Linalg | - | - | Azure |
| | UNet + PNDMScheduler | Azure | - | - | - |
| | Clip1 | Azure | - | - | - |
| | Clip2 | Azure | - | - | - |
| | VAE decode + attn | Azure | - | - | Azure |
| | VAE encode + attn | [GCloud][sdxl-1-1024x1024-f16-stripped-weight-vae-encode] | Same as decode | - | - |
| SDXL1.0 1024x1024 (f32, BS1, len64) | UNet + attn | Azure | Azure | Azure | Azure |
| | Clip1 | Azure | Azure | Azure | - |
| | Clip2 | Azure | Azure | Azure | - |
| | VAE decode + attn | Azure | Azure | Azure | Azure |
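
As a rough illustration of how a weight-stripped MLIR and its splat IRPA fit together, the sketch below compiles one and benchmarks it with splat parameters standing in for real weights. The file names, function name, and input shape are placeholders, and the target flags (`--iree-hal-target-backends=rocm`, `--iree-rocm-target-chip`) follow older IREE releases; newer releases spell them differently.

```shell
# Compile a weight-stripped UNet MLIR for MI300X (placeholder file names; target
# flag spellings vary between IREE releases).
iree-compile unet_stripped.mlir \
  --iree-hal-target-backends=rocm \
  --iree-rocm-target-chip=gfx942 \
  -o unet.vmfb

# Benchmark with the splat IRPA in place of real weights. The function name and
# input shape are placeholders; inspect the module (iree-dump-module) or the
# benchmark script for the real signature. Older builds use rocm:// devices.
iree-benchmark-module \
  --device=hip://0 \
  --module=unet.vmfb \
  --parameters=model=unet_splat.irpa \
  --function=main \
  --input=1x4x128x128xf16=0
```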
## SDXL compiled pipeline IRPAs (f16)

| Submodel | IRPA |
| --- | --- |
| UNet | `scheduled_unet_f16.irpa` |
| Prompt Encoder (CLIP1 + CLIP2) | `prompt_encoder_f16.irpa` |
| VAE | `vae_decode_f16.irpa` |
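
To sanity-check downloaded archives, the listing below uses IREE's `iree-dump-parameters` tool to print the entries in each IRPA; the file names are the ones from the table above and are assumed to sit in the current directory.

```shell
# List the named tensors (and their sizes) stored in each pipeline IRPA.
iree-dump-parameters --parameters=scheduled_unet_f16.irpa
iree-dump-parameters --parameters=prompt_encoder_f16.irpa
iree-dump-parameters --parameters=vae_decode_f16.irpa
```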