kpaigwar closed this issue 2 months ago
fyi @SeanNijjar @cglagovichTT @uaydonat @djordje-tt
Hey @kpaigwar, which type of machine is this?
Hey @SeanNijjar and @cfjchu, this is a Galaxy machine. I have added a unit test to reproduce this issue:
pytest tests/ttnn/multichip_unit_tests/test_multidevice_TG.py::test_device_line_all_gather_8x4_data_async_issue
This will run two tests: the first passes with async_mode turned off, and the second runs into a segfault.
Hey @kpaigwar, in case you're blocked, here's a branch with changes rebased on top of your branch that should work: https://github.com/tenstorrent/tt-metal/tree/asaigal/issue_11089. @cfjchu and I will properly uplift these changes to main.
Thanks @tt-asaigal for the update
@tt-asaigal and @cfjchu, the fix is working on our demo.
@kpaigwar fixes are now in main. Please retry and let us know if any issues.
@cfjchu, tried from main, issue has been resolved.
great to hear - thx!
close?
yes, it's closed
Description
After integrating the new API changes for ttnn.line_all_gather, I am seeing the error below when enabling device async mode on a Galaxy machine.
Error
Repro