cornell-zhang / allo

Allo: A Programming Model for Composable Accelerator Design
https://cornell-zhang.github.io/allo
Apache License 2.0

[Feature] Rewind memory access loops #143

Closed (redbudgithubsec closed this 5 months ago)

redbudgithubsec commented 6 months ago

Issue: Currently, the loops that access top-level variables are automatically generated with pipelining at II=1, which is great. However, in my testing this can still lead to 10x the theoretical runtime for 2D arrays.

Solution: Adding rewind to the end of the automatically generated pipeline pragma fully solves this performance issue, and sometimes also reduces hardware usage.
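For reference, here is a minimal sketch of the kind of change I mean. The function name `load_buf`, the array sizes, and the loop variable names are made up for illustration and are not what Allo actually emits; the only part taken from this issue is appending `rewind` to the `#pragma HLS pipeline II=1` line on the memory access loop. A regular C++ compiler just ignores the pragma, so this also runs as plain software.

```cpp
// Illustrative sizes only; the real program uses its own dimensions.
constexpr int ROWS = 4;
constexpr int COLS = 4;

// Hypothetical load loop: copies a top-level 2D argument into a local buffer.
// The loop nest is perfect (no statements between the two for-loops), so an
// HLS tool can flatten it into a single pipelined loop.
void load_buf(const float src[ROWS][COLS], float buf[ROWS][COLS]) {
  for (int i = 0; i < ROWS; ++i) {
    for (int j = 0; j < COLS; ++j) {
// Before (current setup, as I understand it): #pragma HLS pipeline II=1
// After: the same pragma with "rewind" appended, so the pipeline can start
// the next pass over the loop without draining first.
#pragma HLS pipeline II=1 rewind
      buf[i][j] = src[i][j];
    }
  }
}
```

As far as I know, rewind only takes effect when the pipelined region is a single loop or a perfect loop nest like the one above, so this may not apply to every generated loop.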

Example: my matrix-vector multiply program. Without rewind (current setup): [image] 78-cycle interval for buf1.

With rewind manually added: [image] 4-cycle interval achieved.

redbudgithubsec commented 6 months ago

I'm sorry this is Zack, I'm just on the wrong account.