Closed newling closed 1 month ago
My name is Max and I support this message (thanks for continuing to push on this because I concur aie2xclbin shouldn't have passes in it at all).
Turns out that this prepones a few other bugs that wasn't captured by the lit tests.
Matmul + Truncf PR tries to enable vectorization for all element-wise operations - no issue was caught earlier.
But now, it causes an issue because VectorToAIEVecConversions.cpp
places lot of addDynamicallyLegalOp
constraints on various arith ops. Specifically an error is thrown for arith.addf
op for this lit test.
I guess it'd be nice if we could bridge the gap between AMDAIEVectorization.cpp
and the expectations laid out by VectorToAIEVecConversions.cpp
. One way is clearly to enforce vectorization based on the expectations of VectorToAIEVecConversions.cpp
.
For now, to not at least get in the first step toward supporting Matmul + Truncf in, I've added a stop-gap solution.
Please let me know your thoughts on the overall plan.
Turns out that this prepones a few other bugs that wasn't captured by the lit tests.
Matmul + Truncf PR tries to enable vectorization for all element-wise operations - no issue was caught earlier. But now, it causes an issue because
VectorToAIEVecConversions.cpp
places lot ofaddDynamicallyLegalOp
constraints on various arith ops. Specifically an error is thrown forarith.addf
op for this lit test.I guess it'd be nice if we could bridge the gap between
AMDAIEVectorization.cpp
and the expectations laid out byVectorToAIEVecConversions.cpp
. One way is clearly to enforce vectorization based on the expectations ofVectorToAIEVecConversions.cpp
.For now, to not at least get in the first step toward supporting Matmul + Truncf in, I've added a stop-gap solution.
Please let me know your thoughts on the overall plan.
I like the plan. I guess in this lit test aievec shouldn't be interested in f32 addf.
Yes -- it'd be nice to consolidate AMDAIEVectorization with the later aievec passes. I looked into merging the 2 a few weeks ago, but the main problem I faced was the AMDAIEVectorization runs before bufferization, aievec runs after bufferization.
This PR puts the aievec passes in the main 'IREE' pipeline, rather than running them in aie2xclbin.
I found the trickiest was figuring out why it didn't "just work" without adding Affine and AIEVec as legal dialects in the pattern target in AMDAIECoreToStandard, So I'd like to refactor that pass soon -- there are too many dialects listed there. I'd also like to throw the final lowering to LLVM over the wall from aie2xclbin too, but that was causing other issues which I'd rather address in another PR.