tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
Apache License 2.0

BERT-Large model fails from main on Wormhole N300 #9946

Closed vshenoyTT closed 3 months ago

vshenoyTT commented 3 months ago

Describe the bug: The BERT demo test fails on the main branch.

To Reproduce: Steps to reproduce the behavior:

  1. Check out the main branch.
  2. Run the command: `pytest --disable-warnings --input-path="models/demos/wormhole/stable_diffusion/demo/input_data.json" models/demos/wormhole/stable_diffusion/demo/demo.py::test_demo`
  3. See the error: FAILED models/demos/metal_BERT_large_11/demo/demo.py::test_demo[models/demos/metal_BERT_large_11/demo/input_data.json-1-batch_7] - RuntimeError: TT_FATAL @ ../tt_eager/tt_dnn/op_library/bmm/multi_core_reuse_mcast_2d_optimized/bmm_op_multi_core_reuse_mcast_2d_opt..

Expected behavior: The pytest run passes.


tt-aho commented 3 months ago

The repro command is incorrect: it runs the Stable Diffusion (SD) demo, not BERT.

Looking at the error, it seems like the command would be `pytest models/demos/metal_BERT_large_11/demo/demo.py::test_demo[models/demos/metal_BERT_large_11/demo/input_data.json-1-batch_7]`.

I do not see an error, and this passes for me on commit 6cf8607f70afc79e0cf501cd6e912eb17c8f78c5.
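The test ID above contains square brackets, which some shells (zsh, for example) treat as glob patterns, so quoting it avoids a failure unrelated to the test. A minimal sketch of building a shell-safe command for that one parametrized case; the helper name `build_pytest_command` is hypothetical, and the `--disable-warnings` flag is carried over from the original report rather than required:

```python
import shlex


def build_pytest_command(test_file: str, test_func: str, param_id: str) -> str:
    """Return a shell-safe pytest command for one parametrized test case.

    Hypothetical helper for illustration; not part of tt-metal.
    """
    # A pytest node ID has the form <file>::<function>[<parametrization id>].
    node_id = f"{test_file}::{test_func}[{param_id}]"
    # shlex.quote wraps the node ID in single quotes because it contains
    # '[' and ']', which a shell may otherwise try to expand as a glob.
    return f"pytest --disable-warnings {shlex.quote(node_id)}"


command = build_pytest_command(
    "models/demos/metal_BERT_large_11/demo/demo.py",
    "test_demo",
    "models/demos/metal_BERT_large_11/demo/input_data.json-1-batch_7",
)
print(command)
```

Pasting the printed command into a terminal runs only the `batch_7` case reported in the failure, rather than every parametrization of `test_demo`.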

vshenoyTT commented 3 months ago

Apologies, I attached the wrong command to the issue. However, I was using the command for BERT, not BERT-Large. The test has now passed on my end.