alibaba / BladeDISC

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
Apache License 2.0

Introduce MLIR transform dialect to BladeDISC #787

Open wyzero opened 1 year ago

wyzero commented 1 year ago

We'll start exploring the MLIR transform dialect for code generation of (fused) compute-intensive patterns. The initial target is GEMM codegen on the ARM platform, to address the dynamic shape problem of the Arm Compute Library (ACL).

The initial plan is:

wyzero commented 1 year ago

End-to-end model tests on Bert Base (TF) and Albert (PyTorch), run on g6r with a single thread. Note that we currently have only one default schedule for all shapes, and this schedule is known to be less performant when n or k is large, so the initial performance is expected to improve once we support schedule selection logic.
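To make the idea of schedule selection concrete, here is a minimal sketch of shape-based dispatch with a default fallback. It is only an illustration under stated assumptions: the `Schedule` class, the schedule names, the `LARGE_DIM` cutoff and `select_schedule` are all hypothetical and are not BladeDISC APIs or measured thresholds.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical cutoff for "n or k is large"; not a value from BladeDISC.
LARGE_DIM = 1024


@dataclass(frozen=True)
class Schedule:
    name: str                                 # hypothetical label for a codegen schedule
    applies: Callable[[int, int, int], bool]  # predicate over the runtime GEMM shape (m, n, k)


# Candidates are checked in order; the last entry accepts every shape,
# playing the role of the single default schedule described above.
SCHEDULES: List[Schedule] = [
    Schedule("large_n_or_k", lambda m, n, k: n >= LARGE_DIM or k >= LARGE_DIM),
    Schedule("default", lambda m, n, k: True),
]


def select_schedule(m: int, n: int, k: int) -> str:
    """Return the name of the first schedule whose predicate accepts (m, n, k)."""
    for schedule in SCHEDULES:
        if schedule.applies(m, n, k):
            return schedule.name
    raise ValueError("no schedule matched")  # unreachable while a catch-all entry exists


print(select_schedule(128, 768, 768))   # small n and k fall back to the default
print(select_schedule(128, 3072, 768))  # large n picks the specialized schedule
```

Because the shapes are only known at run time, a dispatch like this would have to run per call, so the predicates should stay cheap.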

Bert Base (TF)

| input | TF 2.8 (s) | DISC-ACL (s) | DISC-Transform (s) | speedup (DISC-ACL time / DISC-Transform time) |
| --- | --- | --- | --- | --- |
| (1, 128) | 0.742 | 0.638 | 0.656 | 97.3% |
| (2, 128) | 1.41 | 1.24 | 1.27 | 97.6% |
| (4, 128) | 2.85 | 2.36 | 2.55 | 92.5% |
| (8, 128) | 5.84 | 4.68 | 5.07 | 92.3% |
| (16, 128) | 11.9 | 9.55 | 10.2 | 93.6% |

Albert (PyTorch)

| input | TorchScript | OnnxRuntime | DISC-ACL | DISC-Transform |
| --- | --- | --- | --- | --- |
| (2, 12) | 0.197 | 0.140 | 0.117 | 0.139 |

wyzero commented 1 year ago

A doc we shared on this work:

https://bladedisc.oss-cn-hangzhou.aliyuncs.com/docs/transform-dialect-based-codegen-in-bladedisc.pdf