extendr / extendr

R extension library for rust designed to be familiar to R users.
https://extendr.github.io
MIT License
426 stars 43 forks source link

Failure when calling `as_typed_slice_mut()` on an empty vector `Robj` #518

Open multimeric opened 1 year ago

multimeric commented 1 year ago
>> test!{
  let matrix = Vec::<u32>::new().into_iter().collect_rarray([0, 2]);
  call!("print", pairlist!(matrix));
}
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Other("Unknown error in converting to slice")', /home/migwell/.cargo/registry/src/github.com-1ecc6299db9ec823/extendr-api-0.4.0/src/robj/into_robj.rs:48:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/panicking.rs:65:14
   2: core::result::unwrap_failed
             at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/result.rs:1791:5
   3: ctx::run_user_code_8::{{closure}}::test
   4: std::panicking::try
   5: run_user_code_8
   6: evcxr::runtime::Runtime::run_loop
   7: evcxr::runtime::runtime_hook
   8: evcxr::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Ilia-Kosenkov commented 1 year ago

Isn't the size of your array effectively 0?

multimeric commented 1 year ago

Yes. Should that be issue? I basically want it to do what R does here, which is have a 0-length vector but also some attributes.

Ilia-Kosenkov commented 1 year ago

To me this is somewhat weird. What is the purpose of [0, 2] matrix if it can never be instantiated (since the size of matrix is 0 * 2= 0 total elements). Here is what I get in R for this

 integer(0) |> matrix(c(0, 2)) |> attributes()
#> $dim
#> [1] 0 0

And this makes more sense to me. Still a [0,0] matrix is useless.

multimeric commented 1 year ago

You're using matrix wrong. The second and third arguments are the dimensions as scalars, not the second argument as a length-2 vector:

> numeric(0) |> matrix(ncol=2) |> attributes()
$dim
[1] 0 2

Still a [0,0] matrix is useless.

It's not useless. This relates to an issue that was requested on Discord recently. A user can conceivably want to return a data structure of the right shape but with no values, so as to not break downstream code.

Ilia-Kosenkov commented 1 year ago

True, my mistake. I agree we need to match R's behavior here.

multimeric commented 1 year ago

Okay. This is (surprise!) caused by a pretty deep issue in extendr. We have implemented the Rust -> R ToVectorValue trait for u32 (and various other unsigned types) to create a double in R:

https://github.com/extendr/extendr/blob/0f2a70a800c011f60b1fcccb8f7a11525b1a9d0e/extendr-api/src/robj/into_robj.rs#L169-L171

However, we don't implement AsTypedSlice<u32> for REALSXP: https://github.com/extendr/extendr/blob/0f2a70a800c011f60b1fcccb8f7a11525b1a9d0e/extendr-api/src/robj/mod.rs#L784-L794

So in the code for array creation, namely Andy's original new_matrix() functions, and my new collect_rarray(), using a u32, usize etc will always (?) fail because it converts from Rust to Robj using ToVectorValue, and then fails to convert back again using AsTypedSlice. I don't think this actually has anything to do with the vector being length 0.

Fixing this is tricky though. I think I will have to add a new macro that adds another code path for AsTypedSlice for these 3 unsigned types, whereby it checks the sexptype, and supports either REALSXP or INTSXP.

Ilia-Kosenkov commented 1 year ago

Hm. So ToVectorValue collects whatever we have in Rust to R object, effectively making a copy (which is OK). In this case, I suspect collecting u32 into REALSXP is justified. But AsTypedSlice basically matches R data with Rust type, creating a 'view'. I see no reason why REALSXP should have an u32 view. How'd you implement that?

multimeric commented 1 year ago

I guess your criticism is that AsTypedSlice is supposed to be a "free", zero-copy operation, and my proposal is to perform an actual conversion + copy from REALSXP to u32? I take your point, although do you not see the issue with our lossy conversion where u32 -> REALSXP -/> u32?

Should I maybe just disallow creating arrays using these unsigned types, because they're weird in this way?

multimeric commented 1 year ago

It's actually pretty tricky to find trait bounds that would forbid this, because I already have

where
        Self::Item: ToVectorValue,
        Robj: AsTypedSlice<'a, Self::Item>

The issue is just that these two conversions aren't actually reciprocal.

One thing I'm wondering is if this line actually makes sense: https://github.com/extendr/extendr/blob/0f2a70a800c011f60b1fcccb8f7a11525b1a9d0e/extendr-api/src/robj/mod.rs#L786

Does it make sense to take a slice of an integer vector as a &[u32]? What if the numbers are negative? Because of course if I delete this line, then the trait bounds forbid the operation that fails here.