nod-ai / iree-amd-aie

IREE plugin repository for the AMD AIE accelerator
Apache License 2.0

[DistributeL1Allocations] Fix logic for conv and correctly propagate errors #918

Open newling opened 14 hours ago

newling commented 14 hours ago

There was what appeared to be a false-positive error in CI, i.e. an op emitted an error somewhere, but the end-to-end numerical test still passed.

I tracked this down and fixed it. There were two issues:

1) The error wasn't being propagated all the way up to where pass failure is signalled.
2) The convolution tiling doesn't use the approach where L1 result tensors are initially of shape <4x4x> and are then distributed to tensors of shape <1x1x>. The logic wasn't quite right, though, and it was trying to do this for conv anyway.

Issue 1 was hiding issue 2: because the error never reached the pass, the pass fell back to default behaviour, which just happened to work.
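
To illustrate how a dropped error can mask a logic bug like this, here is a minimal, self-contained sketch. It does not use the real MLIR or IREE APIs: `Status`, `Op`, `distribute`, and both `runPass...` functions are hypothetical stand-ins invented for this example, and the `canDistribute` flag is only a placeholder for whatever check the real pass performs.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a success/failure result type (not MLIR's).
enum class Status { Success, Failure };

// Hypothetical op: a name plus a flag saying whether the
// 4x4 -> 1x1 style distribution applies to it.
struct Op {
  std::string name;
  bool canDistribute;
};

// Emits a diagnostic and returns Failure when distribution does not apply.
Status distribute(const Op &op, std::vector<std::string> &diagnostics) {
  if (!op.canDistribute) {
    diagnostics.push_back("cannot distribute L1 allocation for " + op.name);
    return Status::Failure;
  }
  return Status::Success;
}

// Buggy variant: the result of distribute() is discarded, so the
// diagnostic is emitted but the pass still reports success.
bool runPassIgnoringErrors(const std::vector<Op> &ops,
                           std::vector<std::string> &diagnostics) {
  for (const Op &op : ops)
    (void)distribute(op, diagnostics);
  return true;
}

// Fixed variant: the first failure is propagated to the caller,
// which is where a real pass would signal failure.
bool runPassPropagatingErrors(const std::vector<Op> &ops,
                              std::vector<std::string> &diagnostics) {
  for (const Op &op : ops)
    if (distribute(op, diagnostics) == Status::Failure)
      return false;
  return true;
}
```

With an op list containing one op that cannot be distributed, the first variant records a diagnostic yet returns `true`, which matches the symptom of an error in the log alongside a passing test. The second variant returns `false`, so the underlying logic bug becomes visible.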