tenstorrent / tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
https://docs.tenstorrent.com/ttnn/latest/index.html
Apache License 2.0

[Bug Report] ... #15363

Open yieldthought opened 3 days ago

yieldthought commented 3 days ago

Describe the bug: Multi-device tensors do not work in comparison mode.

To Reproduce: Enable comparison mode for a multi-device ttnn program:

        with ttnn.manage_config("enable_comparison_mode", True):
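
A minimal reproduction sketch along these lines (hedged: it assumes a 1x2 mesh and the standard ttnn multi-device helpers; the specific op and shapes don't matter, since any op with a registered golden function goes through the same input preprocessing):

    import torch
    import ttnn

    # Hypothetical sketch, not the demo's actual code: open a 1x2 mesh (e.g. one N300).
    mesh_device = ttnn.open_mesh_device(ttnn.MeshShape(1, 2))

    torch_x = torch.randn(1, 2, 32, 64)
    torch_w = torch.randn(64, 64)

    # Shard the activations across the mesh and replicate the weights.
    x = ttnn.from_torch(
        torch_x,
        dtype=ttnn.bfloat16,
        layout=ttnn.TILE_LAYOUT,
        device=mesh_device,
        mesh_mapper=ttnn.ShardTensorToMesh(mesh_device, dim=1),
    )
    w = ttnn.from_torch(
        torch_w,
        dtype=ttnn.bfloat16,
        layout=ttnn.TILE_LAYOUT,
        device=mesh_device,
        mesh_mapper=ttnn.ReplicateTensorToMesh(mesh_device),
    )

    # Comparison mode converts every input to torch for the golden function;
    # with multi-device tensors that conversion raises the error shown below.
    with ttnn.manage_config("enable_comparison_mode", True):
        out = ttnn.linear(x, w)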

Expected behavior: Comparison mode works.

Screenshots

models/demos/llama3/tt/llama_mlp.py:82: in forward
    w1_out = ttnn.linear(
ttnn/ttnn/decorators.py:626: in __call__
    output = self.decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:550: in call_wrapper
    output = decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:408: in call_wrapper
    local_golden_function_args, local_golden_function_kwargs = self.preprocess_golden_function_inputs(
ttnn/ttnn/decorators.py:194: in default_preprocess_golden_function_inputs
    new_arg = recursive_preprocess_golden_function_inputs(arg)
ttnn/ttnn/decorators.py:185: in recursive_preprocess_golden_function_inputs
    return ttnn.to_torch(object_value)
ttnn/ttnn/decorators.py:626: in __call__
    output = self.decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:504: in call_wrapper
    return decorated_function(*function_args, **function_kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

tensor = ttnn.Tensor([[[[ 1.63281, -1.25781,  ...,  0.69531,  1.28906],
               [-1.03125, -1.70312,  ..., -1.60938,  1....69,  ...,  0.08740,  1.25781]]]], shape=Shape([1, 2, 1024, 4096]), dtype=DataType::BFLOAT16, layout=Layout::ROW_MAJOR)

    @ttnn.register_python_operation(name="ttnn.to_torch", golden_function=_golden_function)
    def to_torch(
        tensor: ttnn.Tensor,
        *,
        torch_rank: Optional[int] = None,
        mesh_composer: Optional[ttnn.MeshToTensor] = None,
        device: Optional[ttnn.Device] = None,
        cq_id: Optional[int] = 0,
    ) -> "torch.Tensor":
        """
        Converts the `ttnn.Tensor` tensor into a `torch.Tensor`. It does not call to_layout for bfloat8_b or bfloat4_b as we now convert
        to tile layout during tensor.to_torch().

        Args:
            tensor (ttnn.Tensor): the input tensor.

        Keyword Args:
            torch_rank (int, optional): Desired rank of the `torch.Tensor`. Defaults to `None`.
                Will use `torch.squeeze` operation to remove dimensions until the desired rank is reached. If not possible, the operation will raise an error.
            mesh_composer (ttnn.MeshToTensor, optional): The desired `ttnn` mesh composer. Defaults to `None`.
            device (ttnn.Device, optional): The `ttnn` device of the input tensor. Defaults to `None`.
            cq_id (int, optional): The command queue ID to use. Defaults to `0`.

        Returns:
            torch.Tensor: The converted `torch` tensor.

        Example:
            >>> ttnn_tensor = ttnn.from_torch(torch.randn((2,3)), dtype=ttnn.bfloat16)
            >>> torch_tensor = ttnn.to_torch(ttnn_tensor)
            >>> print(torch_tensor)
            tensor([[-0.3008, -0.8438,  0.3242],
                    [ 0.9023, -0.5820,  0.5312]], dtype=torch.bfloat16)
        """
        if ttnn.is_tensor_storage_on_device(tensor):
            tensor = ttnn.from_device(tensor, cq_id=cq_id)

        if (tensor.layout != ttnn.ROW_MAJOR_LAYOUT) and not (
            tensor.dtype == ttnn.bfloat8_b or tensor.dtype == ttnn.bfloat4_b
        ):
            tensor = tensor.to(ttnn.ROW_MAJOR_LAYOUT, device)

        shape_without_tile_padding = tuple(tensor.shape)
        if tensor.storage_type() == ttnn.DEVICE_STORAGE_TYPE:
            raise RuntimeError("ttnn.Tensor cannot be on device when converting to torch.Tensor!")
        if (tensor.layout != ttnn.ROW_MAJOR_LAYOUT) and not (
            tensor.dtype == ttnn.bfloat8_b or tensor.dtype == ttnn.bfloat4_b
        ):
            raise RuntimeError("ttnn.Tensor has to be in ROW_MAJOR Layout to be converted to torch.Tensor")
        if mesh_composer:
            return mesh_composer.compose(tensor)
>       tensor = tensor.to_torch()
E       RuntimeError: TT_THROW @ ../ttnn/cpp/pybind11/pytensor.cpp:526: tt::exception
E       info:
E       Tensor MultiDeviceHostStorage cannot be converted to torch directly. Use composer(..) functionality.
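
The composer path the error message points at is what a caller would normally use outside comparison mode. A hedged sketch of that working path, assuming ttnn.ConcatMeshToTensor and the `out` tensor from the reproduction sketch above:

    # Explicit composition works: gather the per-device shards into one torch tensor.
    golden = ttnn.to_torch(out, mesh_composer=ttnn.ConcatMeshToTensor(mesh_device, dim=1))

    # The comparison-mode preprocessing (ttnn/decorators.py:185 in the traceback) calls
    # ttnn.to_torch(object_value) with no mesh_composer, so a MultiDeviceHostStorage
    # tensor hits the TT_THROW above.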

Environment information: internal IRD T3K system.

Additional context: This would be super useful for tracking down PCC bugs, but we can't use it on N300+, i.e. all the models we care most about.

yieldthought commented 3 days ago

@ayerofieiev-tt who should this be assigned to?