This doesn't enable vectorization on elementwise ops; it only includes the temporary buffer -> amdaie.buffer change.
I noticed my last local state was enabling vectorization on elementwise ops because I was also trying to test Relu once "Matmul + truncf" worked. I forgot to disable it and went on to add changes to the lit tests.
Adding this as WIP because I'll immediately try to add fixes for enabling vectorization on elementwise/Relu, so I can just reuse the lit test changes that I added previously.