extendr / extendr

R extension library for rust designed to be familiar to R users.
https://extendr.github.io
MIT License
415 stars 42 forks source link

Support for u32 slice removed - workaround #788

Open jonlachmann opened 1 month ago

jonlachmann commented 1 month ago

In my package I have the following code which takes a dgC-matrix from the Matrix R package and creates a sparse matrix in faer which does not copy the data, but instead creates a reference to it:

fn load_dgc_matrix<'a>(mat: S4, x: &'a Robj) -> faer::sparse::SparseColMatRef<'a, u32, f64> {
    use extendr_api::AsTypedSlice;
    let i = mat.get_slot("i").unwrap();
    let p = mat.get_slot("p").unwrap();
    let dim = mat.get_slot("Dim").unwrap();

    let i = i.as_typed_slice().unwrap();
    let p = p.as_typed_slice().unwrap();
    let x = x.as_real_slice().unwrap();
    let dim = dim.as_integer_slice().unwrap();
    let ncols = dim[0] as _;
    let nrows = dim[1] as _;

    // Map the sparse matrix
    let symbolic_1 = faer::sparse::SymbolicSparseColMatRef::new_checked(nrows, ncols, p, None, i);
    let a_mat = faer::sparse::SparseColMatRef::<u32, f64>::new({
            symbolic_1
        },
        x
    );

    return a_mat;
}

This does not work in the latest version of extendr as support for u32 slices was removed. The solution (provided by @CGMossa) is to modify the code as follows, it may not be the best solution in terms of type safety etc. but it works. If it would be possible to provide a full function for handling sparse matrices from Matrix, that would be awesome, as that is probably a not too uncommon use case.

    let i: &[i32] = i.as_typed_slice().unwrap();
    let i: &[u32] = unsafe { std::mem::transmute(i) };
    let p: &[i32] = p.as_typed_slice().unwrap();
    let p: &[u32] = unsafe { std::mem::transmute(p) };
CGMossa commented 1 month ago

This was caused by changes made in #767.

A potential solution for this would be to provide checked conversion, see #789.

Ilia-Kosenkov commented 1 month ago

Checked conversion is computationally heavy and introduces iterations over input vector, unlike any other type conversion we have. I'd suggest to just reinterprer &[i32] as &[u32] with optional runtime check. That way the validation tax will clearly be visible in your code.

FYI @CGMossa bedore you merge anything

CGMossa commented 1 month ago

Checked conversion is computationally heavy and introduces iterations over input vector, unlike any other type conversion we have. I'd suggest to just reinterprer &[i32] as &[u32] with optional runtime check. That way the validation tax will clearly be visible in your code.

FYI @CGMossa bedore you merge anything

Thanks. I'm well aware of your concern about semantics and the API. I've made a PR where I have put this conversion somewhere else that seems more appropriate. Please comment there as to what you think.

For what it's worth, I don't think checking one sign-bit is that costly, however, I fully agree that it should not reside in AsTypedSlice.