extendr / rextendr

An R package that helps scaffolding extendr-enabled packages or compiling Rust code dynamically
https://extendr.github.io/rextendr/
Other
196 stars 28 forks source link

Develop R package with name not equal to `package.name` in `Cargo.toml` #152

Closed mlondschien closed 2 years ago

mlondschien commented 2 years ago

I have a rust crate called hdcd and would like to build an API for R using rextendr. The resulting R package should also be called hdcd. For this I create a new rust package hdcdr, which wraps hdcd and implements some From<...> for Robj traits. Note that this wrapping crate cannot have hdcd as a package name, as it has hdcd as a dependency.

I have been unable to configure the package such that the R-package name (should be hdcd) is different from the package.name entry in Cargo.toml. If I just set the Package: entry in the DESCRIPTION to hdcd, I get the following error upon rextendr::document():

Error in .Call(wrapper_function, use_symbols = use_symbols, package_name = package_name,  : 
  "wrap__make_hdcd_wrappers" not available for .Call() for package "hdcd"
Error: Failed to generate wrapper functions.
✖ callr subprocess failed: "wrap__make_hdcd_wrappers" not available for .Call() for package "hdcd"

Here is the equivalent issue for pyo3: https://github.com/PyO3/maturin/issues/313#issuecomment-784438527.

Following the advice there, I added a lib.name = 'hdcd' entry to Cargo.toml. Then, I get the following error:

   x86_64-conda-linux-gnu-cc -shared -L/home/mlondschien/miniforge3/envs/rust/lib/R/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/mlondschien/miniforge3/envs/rust/lib -Wl,-rpath-link,/home/mlondschien/miniforge3/envs/rust/lib -L/home/mlondschien/miniforge3/envs/rust/lib -o hdcd.so entrypoint.o -L./rust/target/release -lhdcdr -L/home/mlondschien/miniforge3/envs/rust/lib/R/lib -lR
   /home/mlondschien/miniforge3/envs/rust/bin/../lib/gcc/x86_64-conda-linux-gnu/9.4.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lhdcdr
   collect2: error: ld returned 1 exit status
   make: *** [/home/mlondschien/miniforge3/envs/rust/lib/R/share/make/shlib.mk:10: hdcd.so] Error 1
   ERROR: compilation failed for package ‘hdcd’

I feel like this might be fixed by monkeypatching the PKG_LIBS entry in the Makevars file from PKG_LIBS = -L$(LIBDIR) -lhdcdr to PKG_LIBS = -L$(LIBDIR) -lhdcd. Doing so I get back to the first error:

Error in .Call(wrapper_function, use_symbols = use_symbols, package_name = package_name,  : 
  "wrap__make_hdcd_wrappers" not available for .Call() for package "hdcd"
Error: Failed to generate wrapper functions.
✖ callr subprocess failed: "wrap__make_hdcd_wrappers" not available for .Call() for package "hdcd"

Any suggestions on how to proceed?

clauswilke commented 2 years ago

You'd have to manually call the function to generate wrappers, which can be given a separate name for the Rust name and the R package: https://github.com/extendr/rextendr/blob/175509ce4401950fa881a82ccfee1e455a555ffa/R/register_extendr.R#L118-L134

We don't currently expose this functionality in rextendr::document() because it'd likely be more confusing than helpful, but this could be done in principle.

For now, the only help I can give is to say read the source of rextendr::document(), see what it does, and see if you can trace all the locations where the R package name is used but you'd want to use the Rust module name instead. I don't think there's any technical reason that would preclude you from doing what you want to do.

mlondschien commented 2 years ago

Thanks for your response. I made changes to a local installation of rextendr as suggested by you. See https://github.com/extendr/rextendr/compare/main...mlondschien:issue-152?expand=1

With this change, rextendr::document correctly writes R/extendr_wrappers.R. However, the call still fails:

─  DONE (hdcd)
✔ Writing 'R/extendr-wrappers.R'.
ℹ Updating hdcd documentation
ℹ Loading hdcd
Error in getDLLRegisteredRoutines.DLLInfo(dll, addNames = FALSE) : 
  must specify DLL via a “DLLInfo” object. See getLoadedDLLs()

with traceback

> traceback()
9: stop(gettextf("must specify DLL via a %s object. See getLoadedDLLs()", 
       dQuote("DLLInfo")), domain = NA)
8: getDLLRegisteredRoutines.DLLInfo(dll, addNames = FALSE)
7: assignNativeRoutines(dlls[[lib]], lib, env, nsInfo$nativeRoutines[[lib]])
6: load_dll(path)
5: pkgload::load_all(path, helpers = FALSE, attach_testthat = FALSE)
4: load_code(base_path)
3: roxygen2::roxygenise(pkg$path, roclets)
2: devtools::document(pkg = pkg, roclets = roclets, quiet = FALSE)
1: rextendr::document()

Here devtools::document() fails.

I can import hdcd with require(hdcd), however none of the actual functions get imported:

> ls("package:hdcd")
character(0)

Furthermore, _R_CHECK_CRAN_INCOMING_=false R CMD check --as-cran . fails with

(rust) ~/code/rust/hdcd/hdcd-r (rename) $ cat /home/mlondschien/code/rust/hdcd/hdcd-r.Rcheck/00install.out
* installing *source* package ‘hdcd’ ...
** using staged installation
** libs
rm -Rf hdcd.so ./rust/target/release/libhdcdr.a entrypoint.o
x86_64-conda-linux-gnu-cc -I"/home/mlondschien/miniforge3/envs/rust/lib/R/include" -DNDEBUG   -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/mlondschien/miniforge3/envs/rust/include -I/home/mlondschien/miniforge3/envs/rust/include -Wl,-rpath-link,/home/mlondschien/miniforge3/envs/rust/lib   -fpic  -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/mlondschien/miniforge3/envs/rust/include -fdebug-prefix-map=/home/conda/feedstock_root/build_artifacts/r-base-split_1630154298835/work=/usr/local/src/conda/r-base-4.1.1 -fdebug-prefix-map=/home/mlondschien/miniforge3/envs/rust=/usr/local/src/conda-prefix  -c entrypoint.c -o entrypoint.o
cargo build --lib --release --manifest-path=./rust/Cargo.toml
    Finished release [optimized] target(s) in 0.02s
x86_64-conda-linux-gnu-cc -shared -L/home/mlondschien/miniforge3/envs/rust/lib/R/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/mlondschien/miniforge3/envs/rust/lib -Wl,-rpath-link,/home/mlondschien/miniforge3/envs/rust/lib -L/home/mlondschien/miniforge3/envs/rust/lib -o hdcd.so entrypoint.o -L./rust/target/release -lhdcd -L/home/mlondschien/miniforge3/envs/rust/lib/R/lib -lR
installing to /home/mlondschien/code/rust/hdcd/hdcd-r.Rcheck/00LOCK-hdcd-r/00new/hdcd/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘hdcd’ in library.dynam(lib, package, package.lib):
 shared object ‘hdcdr.so’ not found
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/home/mlondschien/code/rust/hdcd/hdcd-r.Rcheck/hdcd’

(the file hdcdr.so does not exist. It's called hdcd.so.)

My understanding of packaging in the R world is rather limited. I would like to publish the R API to the rust package to CRAN. How would you recommend to proceed?

Ilia-Kosenkov commented 2 years ago

Can you create a minimal reproducible example of the problem? A branch in the repository or maybe a dummy project that mimics the hdcd package/crate structure? Such that we can download it and run rextendr::document() locally and inspect produced artifacts? I believe it will speed up the debugging process.

Compilation through {rextendr} is a complex process, some configurations are controlled by {rextendr} and can be (in principle) easily modified, some configs are set up by extendr-* crate, and some are inferred by {roxygen2}, {devtools}, etc. The process is designed to work well with the default setup (i.e., if you wrap hdcd crate, make hdcdr package and import hdcd as a dependency to hdcdr Rut library, or that created by the recommended rextendr::use_extendr()), and we never tested custom naming of underlying Rust library.

My understanding of packaging in the R world is rather limited. I would like to publish the R API to the rust package to CRAN. How would you recommend to proceed?

My personal opinion, which may not reflect the opinions of other extendr/rextendr contributors. 1. Create a regular Rust library (lets call it `hdcd`) and implement all the functionality you need there. Expose through public API necessary types and functions. 2. Use `{usethis}` and `{rextendr}` together, something like this: ```R usethis::create_project("path/to/your/dev/folder/hdcdr") # last folder will be the name of the package # Make sure your workfolder was changed to .../hdcdr, you can check it using `getwd()` and update using `setwd()` rextendr::use_extendr() # This will add all the necessary extendr infrastructure to your project rextendr::document() # Run it twice ``` 3. Verify that you can call dummy Rust function `hello_world()` exported from Rust 4. Go to Rust source of your R package, add reference to `hdcd` and define exported wrappers for all functions that you want to export from `hdcd`. Use `#[extendr]` attribute to decorate functions that should be processed by `extendr`. 5. Use `extendr_module!` macro to finalize the list of types/functions exported to R. Go back to R and run `rextendr::document()` to verify that compilation succeeds. 6. Iterate through 4 and 5 until you export everything and compilation succeeds. 7. In R, create tests using `usethis::use_testthat()`, add tests (you can use `uesthis::use_test()` for that or do it manually), write tests to verify your Rust library is correctly called through Rust and R wrappers. 8. Run `R CMD check` or `rcmdcheck::rcmdcheck()` and see if your package satisfies all the requirements of CRAN. 9. There is no straightforward way to submit `{rextendr}`-based package to CRAN -- people there seem to not like Rust, but @yutannihilation has some experience of submitting packages and passing all checks there. Feel free to check our Discord where we discussed the submission process. 10. If you do not care about CRAN, your package (if `R CMD check` succeeds) should be installable directly from your github repository, so you can then obtain it using, e.g., `remotes::install_github("you/hdcdr")`, providing your machine has Rust toolchain, Rtools (if running on Windows), and `R_HOME` env var set (AFAIK, on Windows may also require `RTOOLS40_HOME`).
mlondschien commented 2 years ago

Thanks @Ilia-Kosenkov.

A (not minimal) example is this: https://github.com/mlondschien/hdcd-rs/. The root directory contains the rust package, The R package which uses rextendr is in hdcd-r. R CMD CHECK and rextendr::document() both work, you can also have a look at CI / tests. Here is a PR which renames the R package: https://github.com/mlondschien/hdcd-rs/pull/55.

I will also try to provide a minimal example.

Thanks for your suggestions on how to develop this. If I understand correctly the package linked to above already implements steps until including 8 (I don't expect you to check this). Where does the issue for a deviating name of the R package and the crate package name get addressed?

here is no straightforward way to submit {rextendr}-based package to CRAN

This is a pity. I do care about submitting there. Thanks, I'll reach out to @yutannihilation via discord.

Ilia-Kosenkov commented 2 years ago

I will try to take a look at your example later today. Perhaps you have a use-case that is important for us to support but we oversaw it.

mlondschien commented 2 years ago

This should serve as a minimal reproducible example: https://github.com/mlondschien/rand. I would like to rename the R package to rand.

Perhaps you have a use-case that is important for us to support but we oversaw it.

I see two use-cases:

Ilia-Kosenkov commented 2 years ago

One wants to wrap an existing crate / build an R API. It seems natural to me for the R API to have the same package name as the rust crate. One wants to build his/her own cargo crate with an R API. If he/she wants to publish the crate on crates.io, it would be nice to exclude the extendr dependency for the main crate and use a wrapping crate in the R-package. Again, it seems natural for the (main) rust crate and the R-package to have the same name.

It is somewhat customary to come up with clever names including R letter when creating R packages. For wrapper packages, I can give you a few examples, e.g. {RSQLite} for SQLite, {rjags} for JAGS, {rstan} for STAN. For non-wrapper packages, {dplyr}, {tidyr}, {withr}, {rlang}, {readr}, etc (my personal favorites are {purrr} and {furrr}). So I believe we thought that for any rust crate named rustcrate there will be a wrapper named rrustcrate or rustcrater or something like this, i.e. the name of Rust wrapper would always differ from the name of the library being wrapped. We ourselves follow this pattern, first creating extendr Rust crate, and then rextendr R package (both times adding an r letter).

However, this rule is not enforced, so we should look into providing the possibility of having R package and Rust crate with different names.

clauswilke commented 2 years ago

Yeah, when I originally set up {rextendr} there was a reason I didn't just call it {extendr}. I thought about it carefully at the time but don't remember the details.

I do think thought it's technically possible to give the rust library that comes with a package a different name than the package name, and we should support this if somebody wants to dig through the code and figure out how to implement it. But be aware that rextendr::document() would never be able to figure this out automatically, so there will be some usability and maintenance issues.

mlondschien commented 2 years ago

But be aware that rextendr::document() would never be able to figure this out automatically

Why not? Can we not just read the package.name entry in the Cargo.toml as here: https://github.com/extendr/rextendr/compare/main...mlondschien:issue-152?expand=1?

clauswilke commented 2 years ago

Ok, maybe it's possible. It's not high on my priority list to work this out, but I'm happy to review a PR.

mlondschien commented 2 years ago

Thanks. Note that with these changes the following issues prevail: https://github.com/extendr/rextendr/issues/152#issuecomment-971421647.

Ilia-Kosenkov commented 2 years ago

So, I played with your rand sample. The wrappers that we generate externally -- using extendr -- look something like this

# Generated by extendr: Do not edit by hand
#
# This file was created with the following call:
#   .Call("wrap__make_randr_wrappers", use_symbols = FALSE, package_name = "rand")

#' Return random number between -2147483648 and 2147483647.
#' @export
rand <- function() .Call("wrap__rand", PACKAGE = "rand")

You can get it by calling

rextendr:::make_wrappers_externally("randr", "rand", "test.R", ".")

Notice the first argument did not affect the generated bindings. However, if you manually change PACKAGE = "randr" (PACKAGE here references R package), then source("test.R"), then your rand() call will work even though the R package is now called rand.

So, we need this support on extendr side and possibly on {rextendr} as well. https://github.com/extendr/extendr/blob/77b780a735c1e29bdcfb722b68ae14f68c66d11f/extendr-api/src/metadata.rs#L117-L174 This is I think the code fragment responsible for writing wrappers. It only understands package_name, but we should probably have package_name and crate_name.

On {rextendr}, the call into special extendr function that returns correct metadata happens here https://github.com/extendr/rextendr/blob/175509ce4401950fa881a82ccfee1e455a555ffa/R/register_extendr.R#L157-L176 It is similar a .Call() with PACKAGE = "rand", which is incorrect (needs to be "randr"). So this also has to be changed.

Ilia-Kosenkov commented 2 years ago

Oh wait, here is the actual method https://github.com/extendr/rextendr/blob/175509ce4401950fa881a82ccfee1e455a555ffa/R/register_extendr.R#L133-L150 So module_name is different from crate name. So we definitely need to distinguish module_name, crate_name and pacakge_name.

The overall change is rather complex, as it spans several projects. It also requires additional tests to ensure we correctly support such scenario. I am unsure if we are going to do it anytime soon (our backlog is full of more important things, including re-working of extendr type system for the next release). However, if you feel like you can help us, I (and others) can guide you through the development process and review your contributions.

mlondschien commented 2 years ago

Thanks a lot @Ilia-Kosenkov!

I still don't understand the details involved here. You mention

There is also

However, if you manually change PACKAGE = "randr" (PACKAGE here references R package)

I do not understand this. I want my (R-)package to be called rand, not randr, why should I set PACKAGE="randr" in that case?

I believe I made it work somehow, see https://github.com/mlondschien/rand/pull/1. Here I use

Running

cargo build --lib --release --manifest-path=./rust/Cargo.toml --target-dir ./rust/target
R CMD build .
_R_CHECK_CRAN_INCOMING_=false R CMD check --as-cran --no-vignettes --no-manual rand_*.tar.gz

seems to work :partying_face:.

Ilia-Kosenkov commented 2 years ago

@mlondschien, Yeah, I probably messed up some names. You can get some insight from the templates that are used to scaffold the {rextendr}-based package. Anyway, without verifying how all these names relate to each other we cannot make any changes to {rextendr}.

P.S. : module_name is indeed name of the module in Rust. We should be able to support multiple modules though I believe we do not use this feature that often. Here you can see that we export several modules and then test the usage. This test project is outdated and I am not sure if we have an instrument to resolve name collisions (same function names in different modules). That is a topic for a dfferent issue.

Ilia-Kosenkov commented 2 years ago

Perhaps we should explicitly specify lib.name. If lib_name is equal to package_name, we resolve the linking issue at compilation time. Then probably package_name and crate_name can be different.

mlondschien commented 2 years ago

seems to work :partying_face:.

One detail though: rextendr::document() does not work, as this will overwrite the extendr-wrappers.R.