nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator
Apache License 2.0

Add hardware-aware canonicalization #874

Closed: jtuyls closed this 2 weeks ago

jtuyls commented 2 weeks ago

Canonicalization of the offsets/strides/sizes in doubly-strided operations, like DMA ops, can produce values that overflow the bits available in the hardware buffer descriptor fields. This PR adds logic to the AMDAIECanonicalizeDoublyStridedOpPass so that it skips any canonicalization that would lead to such an overflow.
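
The guard described above can be sketched as follows. This is a minimal Python illustration, not the pass's actual C++ logic: the `fold_linear_dims` and `addresses` helpers and the 10-bit size limit are assumptions made for the example, and only the folding of adjacent contiguous dimensions is modeled.

```python
from typing import List, Tuple

# Hypothetical limit for illustration only (a 10-bit size field); the real
# limits depend on the device and DMA type and are not taken from this PR.
MAX_SIZE = 1023

Dim = Tuple[int, int]  # (size, stride), outermost dimension first


def fold_linear_dims(dims: List[Dim], max_size: int = MAX_SIZE) -> List[Dim]:
    """Merge adjacent dims that together describe one strided run.

    An outer dim (s0, t0) followed by an inner dim (s1, t1) addresses the
    same elements as the single dim (s0 * s1, t1) whenever t0 == s1 * t1.
    The merge is skipped if the combined size would exceed max_size.
    """
    result: List[Dim] = []
    for size, stride in dims:
        if result:
            outer_size, outer_stride = result[-1]
            mergeable = outer_stride == size * stride
            fits = outer_size * size <= max_size
            if mergeable and fits:
                result[-1] = (outer_size * size, stride)
                continue
        result.append((size, stride))
    return result


def addresses(dims: List[Dim]) -> List[int]:
    """Enumerate the element offsets visited by an access pattern."""
    out = [0]
    for size, stride in dims:
        out = [base + i * stride for base in out for i in range(size)]
    return out
```

With the assumed limit of 1023, `[(8, 64), (64, 1)]` folds to `[(512, 1)]`, while `[(32, 64), (64, 1)]` is left unchanged because the merged size would be 2048.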

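Note that there is more work to be done on avoiding overflow of available hardware bits and making it more robust, which is not addressed in this PR:

- Moving to transaction-based control code generation, as aie-rt will throw errors if hardware field bits are exceeded.
- A pass that can decanonicalize offset/strides/sizes access patterns, for example if available bits are already exceeded before canonicalization.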

newling commented 2 weeks ago

Can you please clarify these points in the summary:

> Moving to transaction-based control code generation, as aie-rt will throw errors if hardware field bits are exceeded.

Throwing errors if hardware resources are exceeded sounds like a good thing, so I'm interpreting this as: moving to transaction based control code generation means moving to using aie-rt. Is that correct?

> A pass that can decanonicalize offset/strides/sizes access patterns, for example if available bits are already exceeded before canonicalization.

Is this about undoing preexisting overflow?

jtuyls commented 2 weeks ago

> Can you please clarify these points in the summary:
>
> > Moving to transaction-based control code generation, as aie-rt will throw errors if hardware field bits are exceeded.
>
> Throwing errors if hardware resources are exceeded sounds like a good thing, so I'm interpreting this as: moving to transaction based control code generation means moving to using aie-rt. Is that correct?

Yes, it means using the aie-rt APIs and the transaction data structure to generate the control code transaction, instead of the manual approach we use currently, which doesn't perform (m)any checks: https://github.com/nod-ai/iree-amd-aie/blob/20867183c610ec870344d074c4be78e0aaac1515/compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIEControlCodeToTransaction.cpp#L33
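
To illustrate why the checks matter, here is a small Python sketch. It is not aie-rt code and does not reflect the real buffer descriptor layout: the field names, widths, and both helper functions are made up for the example. It contrasts a packer that silently masks values with one that rejects values that don't fit.

```python
from typing import Iterable, Tuple

# (name, value, bit width); a hypothetical layout for illustration only.
Field = Tuple[str, int, int]


def pack_unchecked(fields: Iterable[Field]) -> int:
    """Pack fields lowest bits first, silently truncating oversized values."""
    word, shift = 0, 0
    for _name, value, width in fields:
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word


def pack_checked(fields: Iterable[Field]) -> int:
    """Pack fields lowest bits first, raising if a value exceeds its width."""
    word, shift = 0, 0
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word
```

In this sketch, a size of 1024 in a 10-bit field is silently written as 0 by `pack_unchecked`, whereas `pack_checked` raises a `ValueError` that points at the offending field.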

> > A pass that can decanonicalize offset/strides/sizes access patterns, for example if available bits are already exceeded before canonicalization.
>
> Is this about undoing preexisting overflow?

Yes, for example if the initial strides/sizes generated from the pack would overflow. Typically this shouldn't happen, but in principle it could.
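
One way such a decanonicalization could work is to split an oversized dimension into two nested dimensions that each fit. The sketch below is a Python illustration under assumed limits: `split_oversized_dim` is a hypothetical helper, not an existing pass, and it only checks the size field, ignoring any limit on the stride field.

```python
from typing import List, Tuple

Dim = Tuple[int, int]  # (size, stride), outermost dimension first


def split_oversized_dim(size: int, stride: int, max_size: int) -> List[Dim]:
    """Split (size, stride) into two dims whose sizes both fit in max_size.

    The pair (size // inner, inner * stride), (inner, stride) visits the same
    offsets in the same order as the original dim when inner divides size.
    """
    if size <= max_size:
        return [(size, stride)]
    # Prefer the largest inner factor so the outer dimension stays small.
    for inner in range(max_size, 1, -1):
        if size % inner == 0 and size // inner <= max_size:
            return [(size // inner, inner * stride), (inner, stride)]
    raise ValueError(f"size {size} cannot be split into two dims <= {max_size}")
```

For example, with an assumed limit of 1023, a dimension of size 2048 and stride 1 becomes `[(4, 512), (512, 1)]`. A size with no suitable factor, such as a prime larger than the limit, cannot be split this way and raises an error.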

jtuyls commented 2 weeks ago

@newling I addressed the comments, could you check again?