
Number of dimensions of a tensor increases when indexing a dimension of length 1 with a one-dimension logical vector #1174

Closed: gavril0 closed 5 months ago

gavril0 commented 6 months ago

Here is the issue in a nutshell:

> x <- torch_zeros(1) 
> x
torch_tensor
 0
[ CPUFloatType{1} ]
> x[TRUE]
torch_tensor
 0
[ CPUFloatType{1,1} ]

The 1d tensor becomes 2d when indexing the first dimension with TRUE.

The same thing happens in other cases, such as:

> torch_zeros(1,2)[TRUE,]
torch_tensor
(1,.,.) = 
  0  0
[ CPUFloatType{1,1,2} ]
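
Checking the shapes directly makes the extra dimension explicit (the shapes in the comments below just restate the outputs above):

x <- torch_zeros(1)
x$shape                          # 1
x[TRUE]$shape                    # 1 1  (extra leading dimension)
torch_zeros(1, 2)[TRUE, ]$shape  # 1 1 2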

Indexing with a logical vector does not add a dimension when the indexed dimension has length greater than 1:

> torch_zeros(2)[c(TRUE,FALSE)]
torch_tensor
 0
[ CPUFloatType{1} ]

though I would perhaps have expected a 0d tensor in this case, for consistency with

> torch_zeros(2)[1]
torch_tensor
0
[ CPUFloatType{} ]

Using a numeric index with drop = FALSE also does not add (or drop) a dimension:

> torch_zeros(1)[1,drop=FALSE]
torch_tensor
 0
[ CPUFloatType{1} ]
mohamed-180 commented 5 months ago

I think indexing should be done with tensors rather than other data types. I mean, we can use:

> torch_zeros(1,2)[torch_tensor(TRUE),]
torch_tensor
 0  0
[ CPUFloatType{1,2} ]

And

> x <- torch_zeros(1) 
> x[torch_tensor(TRUE)]
torch_tensor
 0
[ CPUFloatType{1} ]
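
The same trick should also work for logical expressions built from R vectors, as long as the result is wrapped in torch_tensor() before indexing. A small sketch (x, y and the values are just an illustration):

x <- torch_arange(1, 3)  # tensor [1, 2, 3] (end is included in the R package)
y <- c(-1, 2, 3)         # a plain R vector
x[torch_tensor(y > 0)]   # selects elements 2 and 3, shape {2}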
gavril0 commented 5 months ago

Thanks for pointing out that the problem does not occur when using a boolean tensor for indexing.

After reading your comment, I looked at the package website and in the book, but I did not find any information or guidelines on using booleans for indexing. Since logical expressions are very natural for indexing in R (e.g. x[y > 0]), I think it is important to document that the result of a logical expression such as y > 0 must be cast to a tensor, if that is indeed the case.

However, as a user, I think one should be able to index torch tensors with R logical vectors and logical expressions in the same way that R numeric vectors can already be used for indexing. Moreover, indexing with R logical vectors already works, except for the case pointed out in this issue, which does not give the expected result. It seems odd to have to cast every R expression that yields a boolean index vector just to avoid this specific problem. Perhaps there are other issues that I am not aware of?

Note: I tried to find out, in the code for the Tensor class, what could explain the fact that torch_tensor(1)[TRUE] yields a 2d tensor CPUFloatType{1,1}, but it is not clear to me where the conversion from an R type to a tensor type occurs.

mohamed-180 commented 5 months ago

I think it is in indexing.R. @dfalbel, could this solve the problem:

tensor_slice <- function(tensor, ..., drop = TRUE) {
  Tensor_slice(torch_tensor(tensor), environment(), drop = drop, mask = .d)
}

for https://github.com/mlverse/torch/blob/10ec53b67d2a409812913d0f03212adb4b071f64/R/indexing.R#L55
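
In the meantime, a user-level wrapper along these lines could hide the cast. This is just a hypothetical helper (lsel() is not part of the package):

lsel <- function(x, i) {
  # coerce R logical indices to boolean tensors to sidestep the extra dimension
  if (is.logical(i)) i <- torch_tensor(i)
  x[i]
}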

gavril0 commented 5 months ago

@mohamed-180, as you suggested, I have converted the R logical vectors into tensors to avoid the issue with indexing.

@dfalbel: is this issue a bug, or is there a reason for this behavior?

Note that this makes my code messier, because I now have to use tensors not only for the index but also for the R vectors being indexed, since

(1:3)[torch_tensor(c(TRUE, FALSE, TRUE))] 

gives an invalid subscript type 'externalptr' error. All the "logic" for selecting the samples in a batch must therefore be implemented with torch tensors, while I would have preferred to use torch tensors only to store the states and/or parameters of the torch modules.
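
Going in the other direction seems to require converting the boolean tensor back to an R logical first; a sketch, assuming as_array() on a boolean tensor yields something as.logical() can handle:

mask <- torch_tensor(c(TRUE, FALSE, TRUE))
(1:3)[as.logical(as_array(mask))]  # 1 3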

More generally, it seems that I keep bumping into small issues that occur because, fundamentally, torch tensors and R vectors are different (see also #1164). So I understand that it is not always obvious how operators should behave in places that mix R vectors and torch tensors. Still, a bit more documentation would be very useful and appreciated, in particular on: indexing and slicing when R vectors and torch tensors are mixed; possible performance hits when converting R vectors to torch tensors and vice versa; implicit conversions linked to the redefinition of standard operators; and the interface between R vectors and "scalar" torch tensors.

dfalbel commented 5 months ago

This does indeed look like a bug to me. We should treat scalar R logical vectors the same as longer logical vectors. To be fair, we are reproducing the same behavior as PyTorch, which in turn copied NumPy's handling of scalar booleans. But it doesn't look like this behavior is relied on in many places. See e.g.:

https://stackoverflow.com/a/75828333/3297472
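
If I understand the NumPy rule correctly, a scalar boolean index inserts a new leading dimension before selecting, so for the example above x[TRUE] should end up with the same shape as x$unsqueeze(1). A sketch of that equivalence, assuming the rule carries over as-is:

x <- torch_zeros(1)
x[TRUE]$shape         # 1 1
x$unsqueeze(1)$shape  # 1 1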