xtensor-stack / xtensor-r

R bindings for xtensor
BSD 3-Clause "New" or "Revised" License
87 stars 15 forks source link

Usefulness of 0D in R #91

Closed DavisVaughan closed 5 years ago

DavisVaughan commented 5 years ago

I'm currently a bit skeptical of the usefulness of this line https://github.com/QuantStack/xtensor-r/blob/0c60b49c14467c060c77b7f6da130f3fa0418cb3/include/xtensor-r/rcontainer.hpp#L75

I totally get what you guys are trying to do by treating a vector of length 1 as a 0-D array, and I get that you want to be consistent with the rest of xtensor. That said, we really don't have the concept of a 0-D array (or even a scalar) in R. Everything is a vector. I think that somehow implies everything is at least 1-D.


Removing that line makes a few results in particular more intuitive to me.

For example:

rray_op_unary_1_arg_cpp("diag", x = 1L, 0L) # crashes

rray_op_unary_1_arg_cpp("diag", x = as.array(1L), 0L)

     [,1]
[1,]    1

Calling xt::diag(1, k = 0) crashes if you pass in 1L, but is fine if you pass in as.array(1L) (because it has a dim = 1 attribute). (both 1L and as.array(1L) are coerced first to rarray objects before they are passed to diag())

I would expect the same result from both, and you indeed get the same result if you remove the line I mention above.


Another example is this one with accumulators https://github.com/QuantStack/xtensor/issues/1335

I understand that with 0-D arrays you should not be able to call cumsum() with an axis argument specified, but it makes sense in R where 1L is not really thought of as 0-D, and is more like a 1-D array of size 1.

rray_op_unary_1_arg_cpp("cumsum", as.array(1L), 0L)
# [1] 1

rray_op_unary_1_arg_cpp("cumsum", 1L, 0L) # fails with a thrown error

Again, I expect the same result for both of these, and you get that with the above mentioned line removed.


Overall it just seems like with the way R works, this is likely going to lead to more confusion than anything else. At least for me in rray, I will have to handle the case where a user passes in a vector of length 1 specially so that we don't crash R because of it being interpreted as a 0-D array.

Thoughts?

DavisVaughan commented 5 years ago

Would you be open to a PR on this? I've crashed R twice now because of this. I just don't see it ever being useful for R users. Here's another example:

library(rray)

# the default, "all" axes
rray_sum(1, axes = NULL)
#> [1] 1

# this should work IMO
rray_sum(1, axes = 1)
#> Error in rray_reducer_cpp(reducer, x, as_cpp_idx(axes)): allocator<T>::allocate(size_t n) 'n' exceeds maximum supported size

# If this works, ^ should work too
rray_sum(1:2, axes = 1)
#> [1] 3

Created on 2019-03-25 by the reprex package (v0.2.1.9000)

SylvainCorlay commented 5 years ago
# this should work IMO
rray_sum(1, axes = 1)

Gotcha, although you would need to offset by one before of the 0-indexing of C++.