torch / torch7

http://torch.ch

Suspected bug when a resize operation follows tensor expansion #1053

Closed MlWoo closed 6 years ago

MlWoo commented 7 years ago
th> a= torch.rand(1,1)
                                                                      [0.0001s] 
th> a
 0.9636
[torch.DoubleTensor of size 1x1]
th> b = a:expand(1,4)
                                                                      [0.0002s] 
th> b
 0.9636  0.9636  0.9636  0.9636
[torch.DoubleTensor of size 1x4]
th> c = b:resize(2,2)
                                                                      [0.0005s] 
th> c
  9.6356e-01   1.0374e-95
 1.7456e+238   1.4818e-76
[torch.DoubleTensor of size 2x2]

                                                                      [0.0002s] 

I understand that expand creates a view of the tensor without any memory allocation. But the result becomes garbage when subsequent operations that touch the underlying storage, such as resize, are applied. I have two questions:

  1. Is this a bug or incorrect usage?
  2. The stride of a dimension is set to 0 when a tensor is expanded along that dimension. What are your rules for handling a stride of 0? I ask because I am developing a more efficient Torch backend for the CPU. Thanks a lot.
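The 0-stride mechanics being asked about can be sketched with NumPy, whose `broadcast_to` builds expanded views the same way Torch's `expand` does (a NumPy analogue for illustration, not Torch itself):

```python
import numpy as np

# a 1x1 tensor backed by a single storage element
a = np.full((1, 1), 0.5)

# broadcast_to, like Torch's expand, creates a view without copying:
# the expanded dimension gets stride 0, so every index along it maps
# back to the same storage element
b = np.broadcast_to(a, (1, 4))

assert b.shape == (1, 4)
assert b.strides[1] == 0       # expanded dimension has stride 0
assert (b == 0.5).all()        # all four entries alias one element
assert not b.flags.writeable   # NumPy marks such views read-only
```

Because every index along the expanded dimension resolves to the same element, only one element of real storage exists behind the four apparent values.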
yzhuang commented 7 years ago

Hi MlWoo,

I think this is not a bug.
torch.expand() creates a view with 0 strides. torch.resize() resizes the underlying storage.

th> a = torch.rand(1,1)
                                                                      [0.0000s]
th> a:storage()
 0.1934
[torch.DoubleStorage of size 1]

                                                                      [0.0002s]
th> b = a:expand(1,4)
                                                                      [0.0010s]
th> b:storage()
 0.1934
[torch.DoubleStorage of size 1]

                                                                      [0.0002s]
th> c = b:resize(2,2)
                                                                      [0.0001s]
th> c:storage()
  1.9344e-01
  2.3204e+77
 3.2379e-318
 2.7813e-309
[torch.DoubleStorage of size 4]

EDIT: reading your comment again, I do agree that this behavior is confusing.

yzhuang commented 7 years ago

Another thing that might help: resize() never copies memory, so I'd expect the current behavior. reshape() always copies memory.

In your case, if you want a more intuitive behavior, maybe you can try using reshape instead.

th> a = torch.rand(1,1)
                                                                      [0.0001s]
th> a
 0.5587
[torch.DoubleTensor of size 1x1]

                                                                      [0.0001s]
th> a:expand(1,4):resize(2,2)
  5.5868e-01 -2.6868e+154
  0.0000e+00   0.0000e+00
[torch.DoubleTensor of size 2x2]

                                                                      [0.0002s]
th> a:expand(1,4):reshape(2,2)
 0.5587  0.5587
 0.5587  0.5587
[torch.DoubleTensor of size 2x2]