tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Full python tracy profiling on `resnet` crashes #5180

Open mo-tenstorrent opened 5 months ago

mo-tenstorrent commented 5 months ago

Running `python -m tracy -m pytest models/demos/resnet/tests/test_perf_resnet.py::test_perf_bare_metal[20-0.015-25]` on a Tracy-only build fails with the following error:


```
E       info:
E       Statically allocated circular buffers in program {} clash with L1 buffers on core range {}. L1 buffer allocated at {} and static circular buffer region ends at {}
E       63
E       [(x=0,y=0) - (x=11,y=7)]
E       617152
E       669476
E       backtrace:
E        --- tt::tt_metal::EnqueueProgram(tt::tt_metal::CommandQueue&, tt::tt_metal::Program&, bool, std::optional<std::reference_wrapper<tt::tt_metal::Trace> >)
E        --- tt::tt_metal::operation::detail::run_device_operation(tt::tt_metal::operation::DeviceOperation const&, std::vector<tt::tt_metal::Tensor, std::allocator<tt::tt_metal::Tensor> > const&, std::vector<std::optional<tt::tt_metal::Tensor const>, std::allocator<std::optional<tt::tt_metal::Tensor const> > > const&, std::vector<std::optional<tt::tt_metal::Tensor>, std::allocator<std::optional<tt::tt_metal::Tensor> > > const&)
E        --- tt::tt_metal::operation::run(tt::tt_metal::operation::DeviceOperation const&, std::vector<tt::tt_metal::Tensor, std::allocator<tt::tt_metal::Tensor> > const&, std::vector<std::optional<tt::tt_metal::Tensor const>, std::allocator<std::optional<tt::tt_metal::Tensor const> > > const&, std::vector<std::optional<tt::tt_metal::Tensor>, std::allocator<std::optional<tt::tt_metal::Tensor> > > const&)
E        --- tt::tt_metal::operation::run_without_autoformat(tt::tt_metal::operation::DeviceOperation const&, std::vector<tt::tt_metal::Tensor, std::allocator<tt::tt_metal::Tensor> > const&, std::vector<std::optional<tt::tt_metal::Tensor const>, std::allocator<std::optional<tt::tt_metal::Tensor const> > > const&, std::vector<std::optional<tt::tt_metal::Tensor>, std::allocator<std::optional<tt::tt_metal::Tensor> > > const&)
E        --- tt::tt_metal::optimized_conv(tt::tt_metal::Tensor const&, tt::tt_metal::Tensor const&, std::optional<tt::tt_metal::Tensor const>, std::optional<tt::tt_metal::Tensor const>, std::vector<int, std::allocator<int> >, unsigned int, bool, bool, bool, MathFidelity, tt::tt_metal::OptimizedConvParallelizationConfig const&, tt::tt_metal::OptimizedConvBlockConfig const&, unsigned int, std::optional<tt::tt_metal::MemoryConfig>, std::optional<tt::tt_metal::DataType>, std::optional<std::array<unsigned int, 4ul> >)
E        --- /home/mmemarian/models/tt-metal/tt_eager/tt_lib/_C.so(+0x2340c9) [0x7f26e56950c9]
E        --- /home/mmemarian/models/tt-metal/tt_eager/tt_lib/_C.so(+0x1a7f76) [0x7f26e5608f76]
```

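
For reference, the four values printed after the message template appear to fill its `{}` placeholders in order: program 63, core range `[(x=0,y=0) - (x=11,y=7)]`, L1 buffer at 617152, and static circular buffer region ending at 669476. A minimal sketch of the arithmetic, assuming the clash means the circular buffer region grows upward past the lowest L1 buffer address (the helper name is hypothetical and not part of tt-metal):

```python
def cb_l1_overlap_bytes(cb_region_end: int, l1_buffer_addr: int) -> int:
    """Return how many bytes the static circular buffer region extends
    past the L1 buffer address, or 0 if the two do not clash.

    Hypothetical helper for illustration only; it assumes circular buffers
    occupy addresses below cb_region_end and the L1 buffer starts at
    l1_buffer_addr.
    """
    return max(0, cb_region_end - l1_buffer_addr)


# Values copied from the error output in this issue.
L1_BUFFER_ADDR = 617152
CB_REGION_END = 669476

overlap = cb_l1_overlap_bytes(CB_REGION_END, L1_BUFFER_ADDR)
print(f"circular buffers overrun the L1 buffer by {overlap} bytes")
```

Under that assumption, the circular buffer region overruns the L1 buffer by 52324 bytes (about 51 KiB) on this program.
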
mo-tenstorrent commented 1 week ago

BOS is now seeing this with `tt-metal/models/demos/resnet/tests/test_perf_resnet_2cqs.py`.

This is blocking Python profiling on ResNet models, hence the P1 status.