1. `export PATH=/path/to/iree/build/release/tools:$PATH`
2. `./compile-txt2img.sh gfx942` (where `gfx942` is the target for MI300X)
3. `./benchmark-txt2img.sh N /path/to/weights/irpa` (where `N` is the GPU index)

> [!CAUTION]
> IRs in the following table might be stale. Use the ones in the `base_ir/` directory instead.

> [!NOTE]
> SDXL-turbo differs from SDXL only in its usage and training/weights. The model architecture (and therefore the weights-stripped MLIR) is equivalent.

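
The three setup steps above (extend `PATH`, compile, benchmark) can be chained in a small wrapper. The following is only a sketch: the variable names `IREE_TOOLS_DIR`, `TARGET`, `GPU_INDEX`, `IRPA_DIR`, and `DRY_RUN`, as well as the `run` helper, are hypothetical and not part of this repository. It assumes it is launched from the directory that contains `compile-txt2img.sh` and `benchmark-txt2img.sh`. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper around the steps above. All variable names and the
# run() helper are illustrative assumptions, not part of the repository.
set -eo pipefail

IREE_TOOLS_DIR="${IREE_TOOLS_DIR:-/path/to/iree/build/release/tools}"
TARGET="${TARGET:-gfx942}"                  # gfx942 is the target for MI300X
GPU_INDEX="${GPU_INDEX:-0}"                 # the N argument of the benchmark script
IRPA_DIR="${IRPA_DIR:-/path/to/weights/irpa}"
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: make the IREE tools visible to the scripts.
export PATH="$IREE_TOOLS_DIR:$PATH"

# Step 2: compile for the chosen GPU target.
run ./compile-txt2img.sh "$TARGET"

# Step 3: benchmark on the chosen GPU with the given IRPA weights path.
run ./benchmark-txt2img.sh "$GPU_INDEX" "$IRPA_DIR"
```

Setting `DRY_RUN=0` and overriding the other variables in the environment (for example `GPU_INDEX=1`) would run the real scripts; keeping the dry-run default makes it easy to check the paths before a long compile starts.
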
| Variant | Submodel | MLIR (No Weights) (Config A) | safetensors | Splat IRPA | MLIR (No Weights) (Config B) |
|---|---|---|---|---|---|
| SDXL1.0 1024x1024 (f16, BS1, len64) | | | | | |
| | UNet + attn | Torch - Linalg | - | - | Azure |
| | UNet + PNDMScheduler | Azure | | | |
| | Clip1 | Azure | - | - | |
| | Clip2 | Azure | - | - | |
| | VAE decode + attn | Azure | - | = | Azure |
| | VAE encode + attn | [GCloud][sdxl-1-1024x1024-f16-stripped-weight-vae-encode] | Same as decode | - | - |
| SDXL1.0 1024x1024 (f32, BS1, len64) | | | | | |
| | UNet + attn | Azure | Azure | Azure | Azure |
| | Clip1 | Azure | Azure | Azure | - |
| | Clip2 | Azure | Azure | Azure | - |
| | VAE decode + attn | Azure | Azure | Azure | Azure |
| SDXL compiled pipeline IRPAs (f16) | | | | | |
| | UNet | scheduled_unet_f16.irpa | | | |
| | Prompt Encoder (CLIP1 + CLIP2) | prompt_encoder_f16.irpa | | | |
| | VAE | vae_decode_f16.irpa | | | |