Open saichandax opened 1 month ago
The data-parallel implementation for DistilBERT is complete. Corresponding PR: #13158
The pipeline for DistilBERT data parallel is enabled, but the test fails when devices are initialized using the fixtures from the conftest files. Previously, using the mesh_device fixture caused the test to hang on the n150 machine, while using the device fixture caused it to hang on n300. Currently, when both fixtures are used, the test runs twice: one run passes and the other fails while closing the device (here). We are actively debugging this issue. The next step is to use the all_device fixture and verify the model.
Corresponding PR https://github.com/tenstorrent/tt-metal/pull/13158
cc: @boris-drazic @yieldthought
The pytest fixture for multi_device has been added to the conftest.py file because of compatibility issues when using the device and mesh_device fixtures on n300 and n150 devices. Corresponding PR: https://github.com/tenstorrent/tt-metal/pull/13158
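
For readers following along, here is a minimal sketch of the general idea behind a single fixture that works on both a one-device (n150) and a two-device (n300) machine. It is not the code from the PR: FakeDevice, multi_device as written here, and run_data_parallel are hypothetical stand-ins, and no ttnn or tt-metal calls are used. The point is only that one setup/teardown path opens however many devices are present and always closes them, even if the test body raises.

```python
from contextlib import contextmanager


class FakeDevice:
    """Hypothetical stand-in for a hardware device handle (not a ttnn class)."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.is_open = True

    def close(self):
        self.is_open = False


@contextmanager
def multi_device(num_available):
    """Open every available device and guarantee that all are closed afterwards.

    In a conftest.py this generator body would typically sit under
    @pytest.fixture instead of @contextmanager (assumption: same yield pattern).
    """
    devices = [FakeDevice(i) for i in range(num_available)]
    try:
        yield devices
    finally:
        # Close in reverse order of opening, even if the test body raised.
        for device in reversed(devices):
            device.close()


def run_data_parallel(devices, batch):
    """Split a batch round-robin across devices and return the shard per device id."""
    shards = [batch[i::len(devices)] for i in range(len(devices))]
    return {device.device_id: shard for device, shard in zip(devices, shards)}
```

With two devices, run_data_parallel(devices, [0, 1, 2, 3]) gives device 0 the shard [0, 2] and device 1 the shard [1, 3]. With one device, the whole batch goes to device 0, so the same test body can run on either machine.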