It should be as robust as the codegen itself? I've tested it locally up to like 1024x1024. One thing to be careful about is stale build artifacts, because of the labyrinthine method used to generate the vmfb. So try clearing the build dir that contains xrt_lite_executables_c.h.
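For reference, a minimal sketch of that cleanup step, assuming the generated header (and any cached .vmfb) live somewhere under the build directory; the directory name and the glob patterns below are assumptions, not the repo's actual layout:

```python
# Minimal sketch: force regeneration by deleting the stale generated
# artifacts. The "build" directory name and the "*.vmfb" pattern are
# assumptions; adjust to wherever xrt_lite_executables_c.h actually lands.
from pathlib import Path

def clear_stale_artifacts(build_dir: str) -> None:
    root = Path(build_dir)
    for pattern in ("**/xrt_lite_executables_c.h", "**/*.vmfb"):
        for path in root.glob(pattern):
            print(f"removing stale artifact: {path}")
            path.unlink()

if __name__ == "__main__":
    clear_stale_artifacts("build")  # hypothetical build dir
```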
I'm trying to adapt/use it to make a standalone matmul benchmarking test.
No advice here but why doesn't iree-e2e-benchmark-module suffice?
I'm fairly sure it's not stale build artifacts: I give the mlir func a new name before runs. But I'll double check.
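For concreteness, a minimal sketch of that fresh-name-per-run idea, assuming the test mlir is emitted from a string template; the template, the naming scheme, and the bf16/f32 element types are assumptions, not the actual test generator in this repo:

```python
# Sketch: emit a matmul func with a unique name and parameterized sizes,
# so a rerun at m=n=512 can never alias a previously compiled 128x128 func.
import uuid

MATMUL_TEMPLATE = """\
func.func @{name}(%lhs: tensor<{m}x{k}xbf16>, %rhs: tensor<{k}x{n}xbf16>) -> tensor<{m}x{n}xf32> {{
  %cst = arith.constant 0.0 : f32
  %empty = tensor.empty() : tensor<{m}x{n}xf32>
  %acc = linalg.fill ins(%cst : f32) outs(%empty : tensor<{m}x{n}xf32>) -> tensor<{m}x{n}xf32>
  %res = linalg.matmul ins(%lhs, %rhs : tensor<{m}x{k}xbf16>, tensor<{k}x{n}xbf16>)
                       outs(%acc : tensor<{m}x{n}xf32>) -> tensor<{m}x{n}xf32>
  return %res : tensor<{m}x{n}xf32>
}}"""

def emit_matmul(m: int, n: int, k: int) -> str:
    # Unique suffix guards against stale artifacts keyed on the func name.
    name = f"matmul_{m}x{n}x{k}_{uuid.uuid4().hex[:8]}"
    return MATMUL_TEMPLATE.format(name=name, m=m, n=n, k=k)

if __name__ == "__main__":
    print(emit_matmul(512, 512, 512))  # the size bumped from 128 in this thread
```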
Doesn't iree-e2e-benchmark-module include all the extra configuration stuff, which swamps the time to actually run the kernel?
I don't know exactly what's currently happening but I know there are ways to make sure that overhead isn't counted.
@makslevental there does appear to be an issue; it is not just a caching mirage. The only change I've made since this PR passed is to change m=n=128 to m=n=512, and now I expect it will fail.
Okay lemme take a look
this is a codegen issue: see https://github.com/nod-ai/iree-amd-aie/pull/897
Motivation: let's test objectFifo as that is now the pipeline we're supporting long-term.