r-lib / rray

Simple Arrays
https://rray.r-lib.org
GNU General Public License v3.0

(Discussion) Subsetting with higher dimensional indices #89

Open DavisVaughan opened 5 years ago

DavisVaughan commented 5 years ago

(Note) This would make [ dimensionality-unstable in general, but it would always be dimensionality-stable when the indexers are vectors, and it would only ever allow the dimensionality to increase, never decrease.

Julia has a neat subsetting syntax that would let you do:

x <- matrix(1:9, 3)
x
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9

idx_row <- matrix(c(1, 2), nrow = 1)
idx_row
#>      [,1] [,2]
#> [1,]    1    2

# x[idx_row, 2]
array(c(4, 5), c(1, 2, 1))
#> , , 1
#> 
#>      [,1] [,2]
#> [1,]    4    5

# essentially this is:
# a normal subset:
# res <- x[c(1, 2), 2]
# followed by a reshape view:
# reshape(res, {1, 2, 1})
# where {1, 2, 1} comes from the
# dimensions of the indexers, (1, 2) and (1)

idx_col <- matrix(c(2, 3), nrow = 1)
idx_col
#>      [,1] [,2]
#> [1,]    2    3

# x[idx_row, idx_col]
array(c(4, 5, 7, 8), c(1, 2, 1, 2))
#> , , 1, 1
#> 
#>      [,1] [,2]
#> [1,]    4    5
#> 
#> , , 1, 2
#> 
#>      [,1] [,2]
#> [1,]    7    8

# essentially this is:
# a normal subset:
# res <- x[c(1, 2), c(2, 3)]
# followed by a reshape view:
# reshape(res, {1, 2, 1, 2})
# where {1, 2, 1, 2} comes from the
# dimensions of the indexers, (1, 2) and (1, 2)

Created on 2019-04-17 by the reprex package (v0.2.1.9000)

It's like a subset + reshape in one go.
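To make the "subset + reshape" reading concrete, here is a minimal base R sketch of the idea. The helpers `index_dim()` and `subset_reshape()` are hypothetical names made up for illustration; they are not part of rray, and this copies the data rather than creating a view. It assumes a 2D input and that a bare vector indexer contributes a single dimension equal to its length.

```r
# Hypothetical helper (not an rray function): the dimensions an indexer
# contributes to the result. A bare vector contributes its length.
index_dim <- function(i) {
  if (is.null(dim(i))) length(i) else dim(i)
}

# Hypothetical helper (not an rray function): subset a matrix as usual,
# then reshape the result to the concatenated dimensions of the indexers.
subset_reshape <- function(x, i, j) {
  res <- x[as.vector(i), as.vector(j), drop = FALSE]
  array(res, dim = c(index_dim(i), index_dim(j)))
}

x <- matrix(1:9, 3)
idx_row <- matrix(c(1, 2), nrow = 1)
idx_col <- matrix(c(2, 3), nrow = 1)

res_row <- subset_reshape(x, idx_row, 2)
dim(res_row)
#> [1] 1 2 1

res_both <- subset_reshape(x, idx_row, idx_col)
dim(res_both)
#> [1] 1 2 1 2

# With plain vector indexers the dimensionality does not change.
res_vec <- subset_reshape(x, c(1, 2), c(2, 3))
dim(res_vec)
#> [1] 2 2
```

With vector indexers the result has the same dimensions as an ordinary `drop = FALSE` subset, which matches the note at the top of the thread: dimensionality can only grow when an indexer itself has more than one dimension.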

DavisVaughan commented 5 years ago

This is sort of related to advanced indexing in NumPy, I think.

DavisVaughan commented 5 years ago

Actually, I think this is dimensionality-stable. All that means is that you must be able to predict the dimensionality of the output from the types of the inputs (their vctrs types). The type includes the 2nd+ dimensions, but not the size (the first dimension).

So:

<array[,2]>[<integer>, ] should always return a result of the same dimensionality, whether <integer> has length 1 or 2 (base R does not guarantee this), but <array[,2]>[<integer[,2]>, ] would be allowed to return something different, because the indexer has a different type.
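The base R behavior mentioned in parentheses can be checked directly. This small sketch uses only base R; the variable names are chosen here for illustration.

```r
x <- matrix(1:6, ncol = 2)  # a 3 x 2 matrix, i.e. an <array[,2]>

# Same indexer type (integer vector), different lengths.
one_row  <- x[1, ]    # length-1 indexer: the row dimension is dropped
two_rows <- x[1:2, ]  # length-2 indexer: the result is still a matrix

dim(one_row)
#> NULL
dim(two_rows)
#> [1] 2 2

# Opting out of dropping restores a predictable dimensionality.
one_row_kept <- x[1, , drop = FALSE]
dim(one_row_kept)
#> [1] 1 2
```

The dimensionality of `x[i, ]` therefore depends on the length of `i`, not only on its type, which is exactly what a dimensionality-stable `[` would rule out.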