tensil-ai / tensil

Open source machine learning accelerators
https://www.tensil.ai
Other

Fix matmul / acc delay #52

Closed tdb-alcorn closed 2 years ago

tdb-alcorn commented 2 years ago

This change fixes a bug where matmul and accumulate (acc) operations waited on each other instead of proceeding in parallel. The control queues feeding them were implicitly linked, so neither could get far ahead of the other. We add buffers at these points to cut the dependencies between buses fed by the same control inputs. In some cases the buffer size is set to 2x the array size, because accumulate operations can't begin until data has propagated through the array.
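
To illustrate the stall, here is a toy cycle-level model in Python. It is not the Tensil hardware code: `simulate`, `buffer_depth`, and the timing rules (one control word issued per cycle, matmul results reaching the accumulator `array_size` cycles after the matmul starts) are assumptions made for this sketch. A single control stream is forked into a matmul queue and an acc queue, and a word is issued only when both queues have room.

```python
from collections import deque


def simulate(num_instructions, array_size, buffer_depth):
    """Return the number of cycles needed to finish every accumulate.

    Hypothetical model: each control word is forked to a matmul queue and
    an acc queue, both with capacity `buffer_depth`. The acc for a word can
    only run once its matmul result has crossed the array, assumed here to
    take `array_size` cycles.
    """
    matmul_q = deque()
    acc_q = deque()
    ready_at = {}  # word index -> cycle at which its matmul result is available
    issued = 0
    acc_done = 0
    cycle = 0

    while acc_done < num_instructions:
        # Acc consumes its head word only if the matching data has arrived.
        if acc_q and ready_at.get(acc_q[0], float("inf")) <= cycle:
            acc_q.popleft()
            acc_done += 1

        # Matmul consumes one control word per cycle when available.
        if matmul_q:
            word = matmul_q.popleft()
            ready_at[word] = cycle + array_size

        # The fork issues a word only when BOTH branches can accept it,
        # which is what links the two queues together.
        can_issue = len(matmul_q) < buffer_depth and len(acc_q) < buffer_depth
        if issued < num_instructions and can_issue:
            matmul_q.append(issued)
            acc_q.append(issued)
            issued += 1

        cycle += 1

    return cycle


if __name__ == "__main__":
    array_size = 4
    coupled = simulate(100, array_size, buffer_depth=1)
    buffered = simulate(100, array_size, buffer_depth=2 * array_size)
    print(f"depth 1: {coupled} cycles, depth {2 * array_size}: {buffered} cycles")
```

In this model, a shallow acc queue fills while the first word waits for its data, which blocks the fork and starves the matmul unit. Once the queue is deeper than the propagation delay, the matmul unit can run every cycle and the acc unit trails it by a fixed latency.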