tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
https://docs.tenstorrent.com/ttnn/latest/index.html
Apache License 2.0

Bring raw L1 read through FD and use it for reading profiler control buffer #15015

Open mo-tenstorrent opened 1 week ago

mo-tenstorrent commented 1 week ago

Because CCL ops take hold of the eth cores, UMD reads and writes are unusable on all remote chips whenever any chip is running a CCL op.

This makes dump device calls hang on async runs that call dump before syncing all devices.

Syncing all devices is not a requirement for the profiler, and if we move to an FD-based read, we no longer need a device sync.

pgkeller commented 3 days ago

We have a workaround, right? Can I drop this to P2?

mo-tenstorrent commented 3 days ago

Yes, thanks.