[TorchScript Usability] tensor printout format differs from eager mode #51138

Open · penguinwu opened this issue 3 years ago

penguinwu commented 3 years ago

Test case: test-tensor-printout-differ.py
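
The test file itself isn't attached here, but a minimal reproduction consistent with the output below (the names and structure are my guess, modeled on the script posted later in this thread) would be something like:

import torch
from typing import Any

def f(a: Any):
    print(a)
    return isinstance(a, torch.Tensor)

x = torch.ones([6])
print("Eager:", f(x))        # eager mode: print() uses Python's tensor repr
m = torch.jit.script(f)
print("TorchScript:", m(x))  # scripted: print() lowers to prim::Print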

> python test-tensor-printout-differ.py 
tensor([1., 1., 1., 1., 1., 1.])
Eager: True
 1
 1
 1
 1
 1
 1
[ CPUFloatType{6} ]
TorchScript: True

TorchScript prints 1-D tensors vertically instead of horizontally as in eager mode. Also, elements of a float tensor should be printed in float format (i.e., 1. instead of 1).

Please also check whether multi-dimensional tensors are printed in the right format.

Related to #50444

cc @gmagogsfm

gautamborad commented 3 years ago

Hi @penguinwu / @gmagogsfm, I would like to work on this issue. I looked at https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/OVERVIEW.md, but am not sure which class to start with. Any pointers would be helpful. Thanks!

gmagogsfm commented 3 years ago

Hi @gautamborad,

Thanks for contributing to PyTorch! I think this is where the logic of prim::Print is implemented; it determines the print format of everything (including Tensor): https://github.com/pytorch/pytorch/blob/3113a1de4ac75bb397911ab1ae0bd3e98de89e03/torch/csrc/jit/runtime/register_prim_ops.cpp#L730

Ideally, we would like to improve this print format to be as close to eager print as possible (maybe even exactly the same). Let us know if you need additional help or code reviews. Thanks again!
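
To see where that op shows up, here is a quick sketch (my own illustration, not from the thread): dumping a scripted function's graph shows the prim::Print node whose kernel is the C++ code linked above.

import torch

@torch.jit.script
def g(x: torch.Tensor):
    print(x)  # in TorchScript this call becomes a prim::Print node
    return x

# The printed IR contains a prim::Print node; its kernel (linked above)
# produces the tensor formatting in question.
print(g.graph)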

gautamborad commented 3 years ago

Thanks, @gmagogsfm, for the pointers. I was able to figure out that the print format for TorchScript is implemented here:

https://github.com/pytorch/pytorch/blob/e56d3b023818f54553f2dc5d30b6b7aaf6b6a325/aten/src/ATen/core/Formatting.cpp#L235

@penguinwu, you are right; we will have to fix the formatting for multi-dimensional tensors too:

import torch
from typing import Any

def f(a: Any):
    print(a)
    return isinstance(a, torch.Tensor)

one_d = torch.ones([2])
two_d = torch.ones([2, 2])
three_d = torch.ones([2, 2, 2])
m = torch.jit.script(f)

print("Eager:")
print("----")
print("one_d:"); f(one_d)
print("two_d:"); f(two_d)
print("three_d:"); f(three_d)
print("")
print("TorchScript:")
print("----")
print("one_d:"); m(one_d)
print("two_d:"); m(two_d)
print("three_d:"); m(three_d)

Output:
Eager:
----
one_d:
tensor([1., 1.])
two_d:
tensor([[1., 1.],
        [1., 1.]])
three_d:
tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])

TorchScript:
----
one_d:
 1 1
[ CPUFloatType{2} ]
two_d:
 1  1
 1  1
[ CPUFloatType{2,2} ]
three_d:
(1,.,.) = 
  1  1
  1  1

(2,.,.) = 
  1  1
  1  1
[ CPUFloatType{2,2,2} ]

Some questions:

  1. Do we need to match the Python output exactly?
     a. The Python output has a tensor() wrapper around the values, but the TorchScript output does not.
     b. TorchScript prints the type and dimensions at the end ([ CPUFloatType{2,2} ]), which the Python output does not.
     c. TorchScript does not have the square-bracket formatting [], which the Python output has.
  2. Can you please point me to some existing test cases for the print format, if there are any?

Thanks!

gautamborad commented 3 years ago

Hi @gmagogsfm / @penguinwu, a reminder to have a look at the comment above. Thanks!

penguinwu commented 3 years ago

TorchScript prints out a bit more information on the type, shape, and device of a tensor; this information can be useful for debugging. So I would suggest keeping such meta information in the new printout but using Python's format for the tensor values.

Here is what I have in mind (new printout):

one_d:
([1., 1.])
[ CPUFloatType{2} ] 
two_d:
([[1., 1.],
        [1., 1.]])
[ CPUFloatType{2,2} ] 
three_d:
([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])
[ CPUFloatType{2,2,2} ] 
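
For illustration, here is a small Python sketch that approximates this proposed format (the dtype-name mapping and the helper are hypothetical; the real fix would live in the C++ Formatting.cpp code linked above):

import torch

# Hypothetical dtype -> ATen type-name mapping, for illustration only.
_DTYPE_NAMES = {torch.float32: "Float", torch.float64: "Double",
                torch.int32: "Int", torch.int64: "Long"}

def proposed_print(t: torch.Tensor) -> None:
    # Reuse Python's eager repr for the values, dropping the "tensor" prefix.
    values = str(t)[len("tensor"):]  # "tensor([1., 1.])" -> "([1., 1.])"
    dims = ",".join(str(d) for d in t.shape)
    name = _DTYPE_NAMES.get(t.dtype, str(t.dtype))
    print(values)
    print(f"[ {t.device.type.upper()}{name}Type{{{dims}}} ]")

proposed_print(torch.ones([2, 2]))
# ([[1., 1.],
#         [1., 1.]])
# [ CPUFloatType{2,2} ]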

@gmagogsfm Are you aware of any test cases that rely on specific tensor print formats? Or which test cases should be run to ensure we are not breaking any tests?

gautamborad commented 3 years ago

@gmagogsfm / @penguinwu, please review the WIP PR #60033. Also, please add relevant people for the review. Thanks! CC: @zdevito / @ezyang