iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

[EPIC][GPU][DT] Bring up GPU data-tiling with reasonable performance #17181

Open hanhanW opened 2 months ago

hanhanW commented 2 months ago

Overview

This is the umbrella issue that collects the tasks for phase 1. In phase 1, we aim to provide a functional data-tiling GPU path with reasonable performance. We are not chasing optimal performance in this phase; instead, we want to enable the path for all e2e tracking models.

Reasonable performance means that we should be able to vectorize data-tiling ops (i.e., pack/unpack/mmt4d-like ops) and apply vector distribution to them.
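For readers new to data-tiling, here is a minimal pure-Python sketch of what the pack/mmt4d/unpack sequence computes. It is not IREE's implementation: the helper names, the 2x2x2 tile sizes, and the zero-padding choice are all assumptions made for illustration. The layout follows the usual mmt4d convention, where the LHS is packed as [M1][K1][M0][K0] and the transposed RHS as [N1][K1][N0][K0].

```python
def pack(matrix, tile_r, tile_c):
    """Pack a 2-D list into tiles indexed [outer_r][outer_c][inner_r][inner_c].

    Out-of-bounds elements are zero-padded (an assumption of this sketch).
    """
    rows, cols = len(matrix), len(matrix[0])
    outer_r = -(-rows // tile_r)  # ceiling division
    outer_c = -(-cols // tile_c)
    return [
        [
            [
                [
                    matrix[i1 * tile_r + i0][j1 * tile_c + j0]
                    if i1 * tile_r + i0 < rows and j1 * tile_c + j0 < cols
                    else 0
                    for j0 in range(tile_c)
                ]
                for i0 in range(tile_r)
            ]
            for j1 in range(outer_c)
        ]
        for i1 in range(outer_r)
    ]


def unpack(packed, rows, cols):
    """Inverse of pack: drop the padding and restore a rows x cols 2-D list."""
    tile_r = len(packed[0][0])
    tile_c = len(packed[0][0][0])
    return [
        [packed[i // tile_r][j // tile_c][i % tile_r][j % tile_c] for j in range(cols)]
        for i in range(rows)
    ]


def mmt4d(lhs, rhs):
    """Tiled matmul with a transposed RHS.

    lhs: [M1][K1][M0][K0], rhs: [N1][K1][N0][K0], result: [M1][N1][M0][N0].
    """
    m1, k1, m0, k0 = len(lhs), len(lhs[0]), len(lhs[0][0]), len(lhs[0][0][0])
    n1, n0 = len(rhs), len(rhs[0][0])
    acc = [[[[0] * n0 for _ in range(m0)] for _ in range(n1)] for _ in range(m1)]
    for i1 in range(m1):
        for j1 in range(n1):
            for p1 in range(k1):
                # The inner three loops operate on one tile pair; on a GPU this
                # is the part one would hope to vectorize and distribute.
                for i0 in range(m0):
                    for j0 in range(n0):
                        for p0 in range(k0):
                            acc[i1][j1][i0][j0] += (
                                lhs[i1][p1][i0][p0] * rhs[j1][p1][j0][p0]
                            )
    return acc


def data_tiled_matmul(a, b, m0=2, n0=2, k0=2):
    """Compute a @ b via pack -> mmt4d -> unpack (tile sizes are illustrative)."""
    b_transposed = [list(col) for col in zip(*b)]
    lhs = pack(a, m0, k0)
    rhs = pack(b_transposed, n0, k0)
    return unpack(mmt4d(lhs, rhs), len(a), len(b[0]))


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(data_tiled_matmul(a, b))
```

The K dimension here (3) is not a multiple of the tile size, so the example also exercises the padding path. The result matches a plain matmul because the padded elements contribute zero to every accumulation.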

ETA: ~1 month

Milestone 1 - enable data-tiling in tests/e2e/matmul test suite

The scope is to compile and execute a linalg.matmul and enable the e2e tests. Additionally, we want to extract a few matmul ops (potentially with dequant ops) from the sdxl and llama models and focus on them. To achieve the milestone, the major tasks are:

@bjacob let's share the above tasks between you and me. I'll convert the tasks into issues soon.

Milestone 2 - enable at least one e2e model on benchmark CI

This milestone mainly focuses on fusion codegen, which allows us to compile and execute ML workloads. For now, the targets are sdxl and sd3.

Major tasks:

Assigning @MaheshRavishankar as the contact point for milestone 2, because he is tracking the TilingInterface support. I can jump into some tasks when needed.

hanhanW commented 2 months ago

(cc @qedawkins @antiagainst @powderluv @stellaraccident )

hanhanW commented 2 months ago

@bjacob I'm volunteering you as the assignee for https://github.com/iree-org/iree/issues/17185 and https://github.com/iree-org/iree/issues/17188 for now, but feel free to pick whichever tasks you're interested in. Also, feel free to update the issues, because I could have missed something. I'll pick up the pack/unpack codegen because I had some patches a long, long time ago; I'll try to revamp them.