tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

update prefetcher/dispatcher to move large globals to stack, increase stack size #11138

Closed pgkeller closed 1 month ago

pgkeller commented 2 months ago

This is pending more data on kernel stack usage

Prefetcher/dispatcher put a table in global memory (copied to local) that chews up a lot of global space. To make room for it, we shrank the stack size. However, there are kernels that put large tables on the stack. Having some kernels do A and others do B leads to out-of-local-memory errors, so we need consistent guidelines.

The plan is for the guidelines to say that these items go on the stack. That is clearer and, for trisc*, necessary.
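To illustrate the difference between the two placements, here is a minimal sketch in plain C++. The table size and the function name are hypothetical, chosen only for illustration; the real prefetcher/dispatcher tables are not shown in this thread. The point is only that a global table holds its memory for the whole lifetime of the kernel, while a stack table holds it only while the function is running.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table size; the actual sizes are not given in this issue.
constexpr std::size_t kTableEntries = 64;

// Option A (the pattern being moved away from): a global table.
// It would occupy kTableEntries * 4 bytes of global space permanently.
// std::uint32_t g_table[kTableEntries];

// Option B (the proposed guideline): the table lives on the stack and
// only consumes memory while this function is active.
std::uint32_t sum_with_stack_table(std::uint32_t base) {
    std::uint32_t table[kTableEntries];  // kTableEntries * 4 bytes of stack
    for (std::size_t i = 0; i < kTableEntries; ++i) {
        table[i] = base + static_cast<std::uint32_t>(i);
    }
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < kTableEntries; ++i) {
        sum += table[i];
    }
    return sum;
}
```

With option B the stack must be sized for the deepest call path that holds such a table, which is why the stack size has to grow along with this change.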

pgkeller commented 2 months ago

This is now POR (plan of record). The large tmp buffers in prefetch/dispatch need to go on the stack, and stack size should be increased as needed (based on watcher). Then stack size needs to bump up to match the data @tt-dma collected.
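One common way to gather stack usage data of this kind is "stack painting": fill the stack region with a known pattern, run the workload, then count how many words were overwritten. The sketch below simulates that on a plain array. It is a generic illustration and not necessarily how the watcher measures stack usage; the fill pattern, the region size, and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fill pattern and simulated stack size (in 32-bit words).
constexpr std::uint32_t kFillPattern = 0xDEADBEEFu;
constexpr std::size_t kStackWords = 256;

using SimStack = std::array<std::uint32_t, kStackWords>;

// Fill the whole region with the pattern before the workload runs.
void paint(SimStack& stack) { stack.fill(kFillPattern); }

// Simulate a downward-growing stack: the workload overwrites the top
// `words` words of the region (highest indices first).
void simulate_use(SimStack& stack, std::size_t words) {
    for (std::size_t i = 0; i < words && i < kStackWords; ++i) {
        stack[kStackWords - 1 - i] = static_cast<std::uint32_t>(i);
    }
}

// Scan from the low end; the first word that no longer holds the pattern
// marks the deepest point the stack reached.
std::size_t high_water_words(const SimStack& stack) {
    std::size_t untouched = 0;
    while (untouched < kStackWords && stack[untouched] == kFillPattern) {
        ++untouched;
    }
    return kStackWords - untouched;
}
```

A measurement like this gives a lower bound on the required stack size, since a workload can write a value that happens to equal the pattern or skip over words without touching them.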

pgkeller commented 1 month ago

@tt-asaigal is this done?

tt-asaigal commented 1 month ago

> @tt-asaigal is this done?

Yes, changes for this were merged to main here: https://github.com/tenstorrent/tt-metal/commit/df4d0131f9f4b455c92b63d40c113eff2d658e02.

I can close the issue.