tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

Hang on reallocate of sharded tensor on n300 #7858

Open xanderchin opened 4 months ago

xanderchin commented 4 months ago

Hang on reallocate of sharded tensor.

Passes on n150, hangs on n300.

pytest 'tests/ttnn/unit_tests/operations/test_reallocate.py::test_ttnn_reallocate[num_allocs=2-mem_config=tt::tt_metal::MemoryConfig(memory_layout=TensorMemoryLayout::BLOCK_SHARDED,buffer_type=BufferType::L1,shard_spec=std::nullopt)]'

jliangTT commented 4 months ago

@tarafdarTT, is this worth taking a look at?

xanderchin commented 4 months ago

@jliangTT I actually have an update here. There was some user error (doh) that @TT-BrianLiu's assertions/checks on main helped catch.

I'm still seeing a hang, but I believe it's another case that should be illegal and asserted on.

I'll update the reproduction details.