Open CharlieL7 opened 7 months ago
Instead of matching the `slice`, we can try to apply this rewrite before horizontal fusion.
Also, we could start by just writing a matcher for `A * B * C` that rewrites it to `A * (B * C)`. If this could be applied before horizontal fusion, then this would work.
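A minimal sketch of the `A * B * C` to `A * (B * C)` rewrite on a toy expression tree, in plain Python. The `Input`, `Const`, and `Dot` classes and the `reassociate` function are hypothetical names made up for illustration, not the project's matcher API, and the sketch assumes `B` and `C` are constants so that `B * C` can be folded ahead of time.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    assert len(a[0]) == len(b), "inner dimensions must agree"
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical toy IR: a runtime input, a constant, and a matmul node.
@dataclass
class Input:
    name: str


@dataclass
class Const:
    value: Matrix


@dataclass
class Dot:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Input, Const, Dot]


def reassociate(e: Expr) -> Expr:
    """Rewrite (A * B) * C to A * (B * C) when B and C are constants,
    folding B * C into a single constant."""
    if not isinstance(e, Dot):
        return e
    lhs, rhs = reassociate(e.lhs), reassociate(e.rhs)
    if isinstance(lhs, Dot) and isinstance(lhs.rhs, Const) and isinstance(rhs, Const):
        folded = Const(matmul(lhs.rhs.value, rhs.value))
        return Dot(lhs.lhs, folded)
    return Dot(lhs, rhs)


def evaluate(e: Expr, env: Dict[str, Matrix]) -> Matrix:
    """Evaluate the expression tree, looking inputs up in env."""
    if isinstance(e, Input):
        return env[e.name]
    if isinstance(e, Const):
        return e.value
    return matmul(evaluate(e.lhs, env), evaluate(e.rhs, env))


def count_dots(e: Expr) -> int:
    """Number of matmul nodes left in the tree."""
    if isinstance(e, Dot):
        return 1 + count_dots(e.lhs) + count_dots(e.rhs)
    return 0


# (A * B) * C with constant B and C.
original = Dot(Dot(Input("A"), Const([[1, 2], [3, 4]])), Const([[0, 1], [1, 0]]))
rewritten = reassociate(original)
env = {"A": [[1, 2], [3, 4]]}

print(evaluate(original, env) == evaluate(rewritten, env))
print(count_dots(original), count_dots(rewritten))
```

The rewrite only pays off when the right-hand operands are constants. Otherwise it just moves the gemm rather than removing it, which is why the sketch checks for `Const` on both sides before folding.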
From the 22 Feb 2024 performance model review of Distilgpt2:
There are several gemms that are applied together (this is the tail end of attention):
We have something like `X * (Y*A + b) * C` (where `*` is matmul) if we get rid of the `slice` (which is undoing some of the horizontal fusions). So we could possibly rewrite it as `X * (Y*A*C + b*C)`, which after const folding would just be `X * (Y*A' + b')` with `A' = A*C` and `b' = b*C`, which gets rid of the gemm completely. This case can be generalized to also not have the `slice` operator, simplifying the manipulations needed.

Deliverables: