iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Add Stable Diffusion Unet/clip/vae compile test to presubmits #12635

Closed powderluv closed 1 month ago

powderluv commented 1 year ago

Request description

This is to prevent issues like https://github.com/openxla/iree/issues/12634. It would be good to compile for CUDA, Vulkan RDNA2, and Vulkan RDNA3 (WMMA). Ideally, we could also run a simple run-module / benchmark-module for those three on an NVIDIA GPU.

What component(s) does this issue relate to?

Other

Additional context

No response

mariecwhite commented 1 year ago

What config and compiler flags should be used for each model? There appear to be several available in https://github.com/nod-ai/SHARK/tree/main/apps/stable_diffusion/src/utils/resources. I currently have it defaulting to "CompVis/stable-diffusion-v1-4" with no additional compiler flags.

mariecwhite commented 1 year ago

I'm also running into this error for VAE:

Failed to import model SD_VAE_MODEL. Exception: Lowering Torch Backend IR -> Linalg-on-Tensors Backend IR failed with the following diagnostics:
error: failed to legalize operation 'torch.aten.convolution' that was explicitly marked illegal
note: see current operation: %173 = "torch.aten.convolution"(%arg0, %157, %156, %171, %172, %171, %2, %172, %17) : (!torch.vtensor<[1,1,4,64,64],f32>, !torch.vtensor<[4,4,1,1],f32>, !torch.vtensor<[4],f32>, !torch.list<int>, !torch.list<int>, !torch.list<int>, !torch.bool, !torch.list<int>, !torch.int) -> !torch.vtensor<[1,4,4,64,?],f32>
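
One thing that can be read directly off the diagnostic is the rank of each tensor operand. The sketch below pulls the `!torch.vtensor` shapes out of the operation signature and prints their ranks; `vtensor_shapes` is a hypothetical helper written for this illustration, not part of torch-mlir. The input has rank 5 while the filter has rank 4, which may suggest the example input carried one extra leading dimension, though that is a guess and not a confirmed root cause.

```python
import re

# Tensor types copied from the diagnostic above (non-tensor operands omitted).
DIAGNOSTIC_TYPES = (
    "(!torch.vtensor<[1,1,4,64,64],f32>, !torch.vtensor<[4,4,1,1],f32>, "
    "!torch.vtensor<[4],f32>) -> !torch.vtensor<[1,4,4,64,?],f32>"
)


def vtensor_shapes(text):
    """Return the shape of every !torch.vtensor type in `text`.

    Hypothetical helper for illustration. Dynamic dimensions ('?') are
    returned as None.
    """
    shapes = []
    for dims in re.findall(r"!torch\.vtensor<\[([^\]]*)\]", text):
        shapes.append([None if d == "?" else int(d) for d in dims.split(",") if d])
    return shapes


if __name__ == "__main__":
    names = ["input", "weight", "bias", "result"]
    for name, shape in zip(names, vtensor_shapes(DIAGNOSTIC_TYPES)):
        print(f"{name}: shape={shape} rank={len(shape)}")
```

A rank-4 filter corresponds to a 2-D convolution, which in PyTorch takes a rank-4 (batched) or rank-3 (unbatched) input, so the rank-5 input is the first thing worth checking when reproducing the failure.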

mariecwhite commented 1 year ago

Another question: Is the sequence length for ClipTextModel fixed or variable? If variable, what is the typical sequence length that we should set for the regression?

powderluv commented 1 year ago

We use 64 and 77 for clip.

@monorimet can you please share the compile flags for vulkan rdna2 / rdna3 and cuda for the SD models?
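
For a regression test with a static input shape, token IDs would need to be pinned to one of the two lengths mentioned above (64 or 77). A minimal sketch of that step follows; `pad_to_length` and the `pad_id=0` default are placeholders for illustration, and the real padding token depends on the tokenizer used.

```python
# Sequence lengths quoted in this thread for CLIP.
CLIP_SEQUENCE_LENGTHS = (64, 77)


def pad_to_length(token_ids, length, pad_id=0):
    """Truncate or right-pad `token_ids` so the result has exactly `length` items.

    `pad_id=0` is a placeholder, not the actual CLIP padding token.
    """
    ids = list(token_ids)[:length]
    return ids + [pad_id] * (length - len(ids))


if __name__ == "__main__":
    example = [101, 102, 103]  # made-up token IDs, for shape only
    for length in CLIP_SEQUENCE_LENGTHS:
        print(length, len(pad_to_length(example, length)))
```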

aaron-schneider commented 1 year ago

Removing "awaiting-triage" since this has both priority and assignee.

monorimet commented 1 year ago

Here are the commands that have worked for me and should match what the SHARK SD apps do during compilation.

UNet ( rdna2 ):

iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna2-unknown-linux --iree-preprocessing-pass-pipeline='builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-preprocessing-convert-conv2d-to-img2col,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-pad-linalg-ops{pad-size=32}))' unet.mlir -o unet.vmfb

VAE ( rdna2 ):

iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna2-unknown-linux --iree-preprocessing-pass-pipeline='builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-preprocessing-convert-conv2d-to-img2col,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-pad-linalg-ops{pad-size=32},iree-linalg-ext-convert-conv2d-to-winograd))' vae.mlir -o vae.vmfb

Clip ( rdna2 ):

iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna2-unknown-linux --iree-preprocessing-pass-pipeline='builtin.module(func.func(iree-preprocessing-pad-linalg-ops{pad-size=32}))' clip.mlir -o clip.vmfb

UNet ( rdna3 ):

iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna3-7900-linux --iree-preprocessing-pass-pipeline='builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-preprocessing-convert-conv2d-to-img2col,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-pad-linalg-ops{pad-size=32}))' unet.mlir -o unet.vmfb

VAE ( rdna3 ):

iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna3-7900-linux --iree-preprocessing-pass-pipeline='builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-preprocessing-convert-conv2d-to-img2col,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-pad-linalg-ops{pad-size=32},iree-linalg-ext-convert-conv2d-to-winograd))' vae.mlir -o vae.vmfb

Clip ( rdna3 ):

iree-compile --iree-input-type=none --iree-hal-target-backends=vulkan --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs -iree-vulkan-target-triple=rdna3-7900-linux --iree-preprocessing-pass-pipeline='builtin.module(func.func(iree-flow-detach-elementwise-from-named-ops,iree-flow-convert-1x1-filter-conv2d-to-matmul,iree-preprocessing-pad-linalg-ops{pad-size=32}))' clip.mlir -o clip.vmfb

UNet ( cuda ):

iree-compile --iree-input-type=none --iree-hal-target-backends=cuda --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs unet.mlir -o unet.vmfb

VAE ( cuda ):

iree-compile --iree-input-type=none --iree-hal-target-backends=cuda --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs vae.mlir -o vae.vmfb

Clip ( cuda ):

iree-compile --iree-input-type=none --iree-hal-target-backends=cuda --iree-stream-resource-index-bits=64 --iree-vm-target-index-bits=64 --iree-util-zero-fill-elided-attrs clip.mlir -o clip.vmfb

Let me know if these work for you.
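
Since the nine commands above differ only in backend, target triple, and preprocessing passes, a test harness could generate them from a small table instead of hard-coding each line. The following is a minimal sketch under that assumption: every flag and pass name is copied from the commands above, while `build_args`, `all_commands`, and the table layout are hypothetical names for illustration and not the SHARK or IREE CI implementation.

```python
import shlex

# Preprocessing passes named in the commands above.
DETACH = "iree-flow-detach-elementwise-from-named-ops"
IMG2COL = "iree-preprocessing-convert-conv2d-to-img2col"
CONV1X1 = "iree-flow-convert-1x1-filter-conv2d-to-matmul"
PAD = "iree-preprocessing-pad-linalg-ops{pad-size=32}"
WINOGRAD = "iree-linalg-ext-convert-conv2d-to-winograd"

# target name -> (HAL backend, Vulkan target triple or None)
TARGETS = {
    "rdna2": ("vulkan", "rdna2-unknown-linux"),
    "rdna3": ("vulkan", "rdna3-7900-linux"),
    "cuda": ("cuda", None),
}

# (model, target) -> preprocessing passes; CUDA entries are absent (no passes).
PASSES = {
    ("unet", "rdna2"): [DETACH, IMG2COL, CONV1X1, PAD],
    ("vae", "rdna2"): [DETACH, IMG2COL, CONV1X1, PAD, WINOGRAD],
    ("clip", "rdna2"): [PAD],
    ("unet", "rdna3"): [DETACH, IMG2COL, CONV1X1, PAD],
    ("vae", "rdna3"): [DETACH, IMG2COL, CONV1X1, PAD, WINOGRAD],
    ("clip", "rdna3"): [DETACH, CONV1X1, PAD],
}


def build_args(model, target):
    """Return the iree-compile argument list for one (model, target) pair."""
    backend, triple = TARGETS[target]
    args = [
        "iree-compile",
        "--iree-input-type=none",
        f"--iree-hal-target-backends={backend}",
        "--iree-stream-resource-index-bits=64",
        "--iree-vm-target-index-bits=64",
        "--iree-util-zero-fill-elided-attrs",
    ]
    if triple is not None:
        # The thread writes this flag with one leading dash; two dashes are
        # used here only to match the spelling of the other flags.
        args.append(f"--iree-vulkan-target-triple={triple}")
    passes = PASSES.get((model, target))
    if passes:
        pipeline = "builtin.module(func.func(" + ",".join(passes) + "))"
        args.append(f"--iree-preprocessing-pass-pipeline={pipeline}")
    args += [f"{model}.mlir", "-o", f"{model}.vmfb"]
    return args


def all_commands():
    """Return shell-quoted command lines for every model on every target."""
    return [
        shlex.join(build_args(model, target))
        for target in TARGETS
        for model in ("unet", "vae", "clip")
    ]


if __name__ == "__main__":
    for command in all_commands():
        print(command)
```

Keeping the passes in one table makes the per-target differences easy to see, for example that only the VAE pipelines add the Winograd conversion and that the CUDA builds use no preprocessing pipeline at all.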

allieculp commented 1 year ago

@mariecwhite Can you update this one?

mariecwhite commented 1 year ago

Sorry, I didn't see Ean's comment. Clip and UNet are in CI and running for CUDA. VAE was working but has now regressed. I also need to add support for running the models with Vulkan/SPIR-V.

mariecwhite commented 1 year ago

Thanks for adding Vulkan benchmarks, @antiagainst! For VAE, there seems to be an issue with the model: https://github.com/openxla/iree/issues/12771

GMNGeoffrey commented 1 year ago

Unassigning myself from issues that I'm not actively working on

mariecwhite commented 1 year ago

Assigning this to @antiagainst since he picked up this task.

antiagainst commented 1 year ago

The only missing one is VAE, I think, which is pending in #12767. Any chance we can move that forward, @mariecwhite?

antiagainst commented 1 month ago

obsolete