nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator
Apache License 2.0
45 stars 23 forks source link

Hoist the bias allocation #444

Closed newling closed 2 weeks ago

newling commented 2 weeks ago

The elementwise input is promoted to local memory after the pass to hoist statically bound allocations. It is the only tensor which is promoted after the hoisting pass. This PR is just a tidy-up, so that all of the allocations are hoisted before bufferization.

yzhang93 commented 2 weeks ago

@yzhang93 yes, that's what I meant by my comment. It is the only one which isn't hoisted before the comprehensive bufferization pass, and it is neater if it is hoisted too imo.

yzhang93 commented 2 weeks ago

@newling I think you edit my comment instead of replying it. My point is there is new allocation created during comprehensive-bufferization(https://github.com/nod-ai/iree-amd-aie/blob/main/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp#L76), so in any case we need to do hoisting after comprehensive-bufferization. https://github.com/nod-ai/iree-amd-aie/blob/main/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp#L294. What you added before bufferization is redundant.

newling commented 2 weeks ago

@newling I think you edit my comment instead of replying it. My point is there is new allocation created during comprehensive-bufferization(https://github.com/nod-ai/iree-amd-aie/blob/main/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp#L76), so in any case we need to do hoisting after comprehensive-bufferization. https://github.com/nod-ai/iree-amd-aie/blob/main/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Passes.cpp#L294. What you added before bufferization is redundant.

As I replied before, I understand that there is allocation during comprehensive bufferization and that this PR is redundant in the sense that this bias allocation will be lowered later, my point is that the IR is more complex without it (most allocations hoisted, 1 which hasn't, until comprehensive bufferization). But I will close as you don't seem to agree :)