Open joelsmithTT opened 2 days ago
Can you check in a few of the benchmarks you have? It would be a nice thing to have in the repo. Later on we can add them as perf tests on CI.
@pjanevskiTT - see commit aa49e27cd874f896bfb8f65b4f04574a75422ca7 for both the code and the commit message about the caveats.
Folks working on the Metal project have noticed the poor performance attributes of UMD's generic IO code paths. A recommendation has been to use arrays rather than hash tables. While I agree that this could reduce the performance overhead, my view is that the performance problems UMD has here are a symptom of a bigger problem. A comprehensive solution will tie in with this work: https://github.com/tenstorrent/tt-umd/issues/273
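To make the array-versus-hash-table recommendation concrete, here is a minimal sketch of the two lookup styles. It is not UMD's actual data structure: `CoreCoord`, `HashWindowTable`, and `ArrayWindowTable` are hypothetical names, and the table is assumed to map a core coordinate to a window index on a fixed-size grid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical core coordinate; not a UMD type.
struct CoreCoord {
    std::uint32_t x;
    std::uint32_t y;
};

// Hash-table variant: every lookup hashes the key and walks a bucket.
class HashWindowTable {
public:
    void set(CoreCoord c, std::uint32_t window) { map_[key(c)] = window; }
    std::uint32_t get(CoreCoord c) const { return map_.at(key(c)); }

private:
    static std::uint64_t key(CoreCoord c) {
        return (static_cast<std::uint64_t>(c.y) << 32) | c.x;
    }
    std::unordered_map<std::uint64_t, std::uint32_t> map_;
};

// Array variant: a lookup is one multiply-add and one indexed load.
// Assumes the grid dimensions are known when the table is built.
class ArrayWindowTable {
public:
    ArrayWindowTable(std::uint32_t width, std::uint32_t height)
        : width_(width), height_(height),
          windows_(static_cast<std::size_t>(width) * height, 0) {}

    void set(CoreCoord c, std::uint32_t window) { windows_[index(c)] = window; }
    std::uint32_t get(CoreCoord c) const { return windows_[index(c)]; }

private:
    std::size_t index(CoreCoord c) const {
        assert(c.x < width_ && c.y < height_);
        return static_cast<std::size_t>(c.y) * width_ + c.x;
    }
    std::uint32_t width_;
    std::uint32_t height_;
    std::vector<std::uint32_t> windows_;
};
```

Both tables return the same values. The array version only removes the hashing cost from the lookup; it does nothing about the rest of the work done on each IO call.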
UMD has an existing workaround where the application can perform one-time construction of a Writer object and hang onto it for fast MMIO writes. This is a hack: by encapsulating a pointer to an offset of the PCIe bar, the Writer makes the assumption that the underlying window (which is mapped to a NOC destination) won't change underneath it (or, if it does change, that the change was initiated by the application).
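The cached-pointer idea can be sketched roughly as follows. This is an illustration of the pattern, not UMD's Writer: `FakeBar` and `CachedWriter` are hypothetical names, and the BAR is modeled as ordinary host memory (a real MMIO mapping would need volatile or otherwise ordered stores).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a mapped PCIe BAR: plain host memory.
struct FakeBar {
    std::vector<std::uint8_t> bytes;
    explicit FakeBar(std::size_t size) : bytes(size, 0) {}
};

// Hypothetical cached writer. The window offset is resolved once, at
// construction; each later write is a bounds check plus a copy.
class CachedWriter {
public:
    CachedWriter(FakeBar& bar, std::size_t window_offset, std::size_t window_size)
        : base_(bar.bytes.data() + window_offset), size_(window_size) {
        assert(window_offset + window_size <= bar.bytes.size());
    }

    void write32(std::size_t offset, std::uint32_t value) const {
        assert(offset + sizeof(value) <= size_);
        // Nothing here knows which NOC destination the window targets.
        // If the window is remapped, this store lands somewhere else.
        std::memcpy(base_ + offset, &value, sizeof(value));
    }

private:
    std::uint8_t* base_;  // points into the BAR, fixed for the writer's lifetime
    std::size_t size_;
};
```

The speed comes from skipping any per-write lookup. The fragility comes from the same place: the writer holds only an address, so it cannot detect that the window behind that address now points at a different destination.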
Some hacky benchmarking reveals the following numbers on one of my Wormhole systems. Release build, with dynamic TLB path.
0x1000 byte write to Tensix L1, measured inside pci_device.cpp: ~230 nanoseconds
0x1000 byte write to Tensix L1, measured at tt_SiliconDevice interface: ~2980 nanoseconds
0x1000 byte read from Tensix L1, measured inside pci_device.cpp: ~1123483 nanoseconds
0x1000 byte read from Tensix L1, measured at tt_SiliconDevice interface: ~1149822 nanoseconds
0x4 byte write to Tensix L1, measured inside pci_device.cpp: 20 nanoseconds
0x4 byte write to Tensix L1, measured at tt_SiliconDevice interface: 1360 nanoseconds
0x4 byte read from Tensix L1, measured inside pci_device.cpp: 1150 nanoseconds
0x4 byte read from Tensix L1, measured at tt_SiliconDevice interface: 1850 nanoseconds
There is opportunity for improvement here, by making the code between the UMD interface and the underlying write do less.
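For anyone who wants to reproduce this kind of measurement, a minimal timing harness might look like the sketch below. It is not the code from the commit referenced above: `average_ns_per_call` and `FakeDevice` are hypothetical, and the fake device only copies into host memory, so its timings say nothing about real hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Average wall-clock nanoseconds per call of `op` over `iterations` runs.
// Returns 0 when iterations is 0.
template <typename Op>
double average_ns_per_call(Op&& op, std::size_t iterations) {
    if (iterations == 0) {
        return 0.0;
    }
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = clock::now();
    const auto total =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return static_cast<double>(total.count()) / static_cast<double>(iterations);
}

// Hypothetical stand-in for the device interface under test.
struct FakeDevice {
    std::vector<std::uint8_t> l1 = std::vector<std::uint8_t>(0x10000, 0);

    void write(std::size_t addr, const void* src, std::size_t len) {
        assert(addr + len <= l1.size());
        std::memcpy(l1.data() + addr, src, len);
    }
};
```

Wrapping the same operation at two layers (for example, once around the low-level write and once around the public interface call) gives the kind of inside-versus-interface comparison listed above. Averaging over many iterations hides per-call variance, so outliers need a separate look.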