tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

WH: linear topology of 2 n300s hangs at init #9518

Open pgkeller opened 2 months ago

pgkeller commented 2 months ago

tt-metal/tt_metal/.umd/cluster_desc.yaml shows:

    chips: {
       0: [0,0,0,0],
       1: [0,0,0,0],
       2: [1,0,0,0],
       3: [1,0,0,0],
    }
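For anyone triaging this, here is a minimal sketch that groups the chip IDs from the `chips` map quoted above by coordinate, so shared coordinates are easy to spot. The map is hard-coded from this report; `group_chips_by_coord` and `shared_coords` are hypothetical helper names, not part of tt-metal or UMD. Whether shared coordinates are related to the hang is an assumption, not something this thread confirms.

```python
from collections import defaultdict

# Chip-to-coordinate map copied by hand from the cluster descriptor quoted
# in this issue (not read from the YAML file).
CHIPS = {
    0: [0, 0, 0, 0],
    1: [0, 0, 0, 0],
    2: [1, 0, 0, 0],
    3: [1, 0, 0, 0],
}


def group_chips_by_coord(chips):
    """Return {coordinate tuple: [chip ids]} with chip ids in ascending order."""
    groups = defaultdict(list)
    for chip_id, coord in sorted(chips.items()):
        groups[tuple(coord)].append(chip_id)
    return dict(groups)


def shared_coords(chips):
    """Keep only the coordinates that more than one chip reports."""
    return {
        coord: ids
        for coord, ids in group_chips_by_coord(chips).items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    for coord, ids in shared_coords(CHIPS).items():
        print(f"coordinate {list(coord)} is reported by chips {ids}")
```

To run this against a real descriptor, the hard-coded dictionary would be replaced with the parsed `chips` section of the file.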

Initialization fails with:

    libc++abi: terminating due to uncaught exception of type std::runtime_error: TT_ASSERT @ ../tt_metal/llrt/tt_cluster.cpp:641: tunneled_device_hit || (it == device_ids.end())
    info:
    Loop Exit Error.
    backtrace:
     --- void tt::assert::tt_assert<char [17]>(char const*, int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, bool, char const*, char const (&) [17])
     --- tt::Cluster::get_tunnels_from_mmio_device(int) const
     --- tt::DevicePool::init_firmware_on_active_devices() const
     --- build_debug/test/tt_metal/unit_tests_fast_dispatch(+0x4196a) [0x55cbaa0ff96a]

pgkeller commented 3 weeks ago

Use machine 172.27.28.27 (e09cs07) re-configured for linear topology