Closed: yieldthought closed this issue 7 months ago
Yep, several ops are affected; I'll fix them all:
```
$ grep -A1 -r 'TT_FATAL(input_tensor_a.device() == input_tensor_b.device()' *
tt_eager/tt_dnn/op_library/transformer_tms/transformer_tms.cpp: TT_FATAL(input_tensor_a.device() == input_tensor_b.device(), "Operands to matmul need to be on the same device!");
tt_eager/tt_dnn/op_library/transformer_tms/transformer_tms.cpp- TT_FATAL(input_tensor_a.buffer() != nullptr and input_tensor_b.buffer() != nullptr, "Operands to matmul need to be allocated in buffers on device!");
--
tt_eager/tt_dnn/op_library/transformer_tms/transformer_tms.cpp: TT_FATAL(input_tensor_a.device() == input_tensor_b.device(), "Operands to matmul need to be on the same device!");
tt_eager/tt_dnn/op_library/transformer_tms/transformer_tms.cpp- TT_FATAL(input_tensor_a.buffer() != nullptr and input_tensor_b.buffer() != nullptr, "Operands to matmul need to be allocated in buffers on device!");
--
tt_eager/tt_dnn/op_library/bmm/bmm_op.cpp: TT_FATAL(input_tensor_a.device() == input_tensor_b.device(), "Operands to matmul need to be on the same device!");
tt_eager/tt_dnn/op_library/bmm/bmm_op.cpp- TT_FATAL(input_tensor_a.buffer() != nullptr and input_tensor_b.buffer() != nullptr, "Operands to matmul need to be allocated in buffers on device!");
--
tt_eager/tt_dnn/op_library/bmm/bmm_op.cpp: TT_FATAL(input_tensor_a.device() == input_tensor_b.device(), "Operands to matmul need to be on the same device!");
tt_eager/tt_dnn/op_library/bmm/bmm_op.cpp- TT_FATAL(input_tensor_a.buffer() != nullptr and input_tensor_b.buffer() != nullptr, "Operands to matmul need to be allocated in buffers on device!");
--
tt_eager/tt_dnn/op_library/eltwise_binary/eltwise_binary_op.cpp: TT_FATAL(input_tensor_a.device() == input_tensor_b.device(), "Operands to eltwise binary need to be on the same device!");
tt_eager/tt_dnn/op_library/eltwise_binary/eltwise_binary_op.cpp- TT_FATAL(input_tensor_a.buffer() != nullptr and input_tensor_b.buffer() != nullptr, "Operands to eltwise binary need to be allocated in buffers on device!");
--
tt_eager/tt_dnn/op_library/bcast/bcast_op.cpp: TT_FATAL(input_tensor_a.device() == input_tensor_b.device(), "Operands to bcast need to be on the same device!");
tt_eager/tt_dnn/op_library/bcast/bcast_op.cpp- TT_FATAL(input_tensor_a.buffer() != nullptr and input_tensor_b.buffer() != nullptr, "Operands to bcast need to be allocated in buffers on device!");
```
**Describe the bug**
In `operations::primary::MatMul::validate` the order of these two checks is incorrect: the `TT_FATAL` comparing `input_tensor_a.device() == input_tensor_b.device()` runs before the `TT_FATAL` verifying that both buffers are non-null. Calling `tensor.device()` on a tensor with a null buffer will segfault.

**To Reproduce**
Note: you will need to reset the device after this crash.

**Expected behavior**
A `TT_FATAL` reporting that input a has a null buffer (or something like that).
**Additional context**
This might affect other ops that use this assert order too.