Closed yutannihilation closed 1 year ago
Now I think I've figured out what these should be. But..., it feels a bit too massive.
https://github.com/yutannihilation/string2path/blob/main/configure.win https://github.com/yutannihilation/string2path/blob/main/configure
Yes, it's absurd that every package using cargo is supposed to ship and maintain this amount of code. It would b nice if R/CRAN could in some form support Rust extension packages. Not sure what needs to happen for that to become a possibility.
Some lines can be omitted, but I guess this is the reality... I'll write a blog post about this so that we can discuss what options might be acceptable to us. Hopefully soon.
I'll write a blog post about this so that we can discuss
I wrote, but it seems I still need some more time to figure out how to move this issue forward... This is unnecessarily lengthy so I don't really recommend to read, but l post here for reference only.
https://yutani.rbind.io/post/2021-09-21-writing-a-configure-script-for-an-r-package-using-rust/
If we are to keep this, we need to provide tooling for developers for upgrading this script if the new binary is released.
Another thing is I'd prefer to have various configurations in DESCRIPTION file (like roxygen), but I doubt we can read values from there during compilation.
upgrading this script if the new binary is released.
A simple solution would be defining the tag and the expected hash in a separate file e.g. tools/hashes.sh
. The configure scripts can source it so that the developers don't need to update the whole configure
and configure.win
.
Another thing is I'd prefer to have various configurations in DESCRIPTION file (like roxygen), but I doubt we can read values from there during compilation.
Interesting idea, thanks. I think "${RSCRIPT}" -e 'read.dcf("./DESCRIPTION")'
should work. What kind of configuration do you want?
I managed to squash most of the actual logics into one file and settings.
Now configure
is only this long, and the developers don't need to update this usually. This is not yet very portable because of the undocumented convention on the locations of precompiled binary, but this feels a bit better now.
# In configure scripts, Rscript cannot be used without path specified
RSCRIPT="${R_HOME}/bin/Rscript"
# Load the variables (e.g. github repo, tag, and checksum)
. ./tools/vars.sh
# Load common functions (e.g. check_cargo)
. ./tools/common_functions.sh
echo "NOT_CRAN: ${NOT_CRAN}"
# Even when `cargo` is on `PATH`, `rustc` might not. This should ensure PATH is
# configured correctly (c.f. https://github.com/yutannihilation/string2path/issues/4).
if [ -e "${HOME}/.cargo/env" ]; then
. "${HOME}/.cargo/env"
fi
# Check if `cargo` command exists ----------------------------------------------
check_cargo
# If cargo is confirmed fine, exit here. But, even if the cargo is not available
# or too old, it's not the end of the world. There might be a pre-compiled
# binary for the platform.
if [ $? -eq 0 ]; then
sed -e 's|@CLEAN_EXTRA@|$(STATLIB)|' src/Makevars.in > src/Makevars
exit 0
fi
# Download the precompiled binary ----------------------------------------------
download_binaries
sed -e 's|@CLEAN_EXTRA@||' src/Makevars.in > src/Makevars
exit 0
Interesting idea, thanks. I think "${RSCRIPT}" -e 'read.dcf("./DESCRIPTION")' should work. What kind of configuration do you want?
Sorry I forgot to answer.
I was thinking if it is possible to put all the metadata about repository name, version, checksums in the description file?
I understand we can have several binaries per platform and it may bloat the file, if all of the checksums are in the DESCRIPTION
file.
read.dcf
seems to work fine. One possibility is that we list all the variables that need to be set (e.g., your vars.sh
). DESCRIPTION
of a package can be easily accessed, so consumers can verify what are they actually installing.
Your common_functions.sh
looks insane, so many things have to be done. I wonder if we can somehow reduce it in size.
A wild idea: AFAIK dependencies are installed before package compilation, what if we make a package or add code to {rextendr}
that does all of this? Then the only thing needed is just calling a function from the package, like {RSCRIPT} -e "rextendr::assist_compilation()"
, plus perhaps an optional parameter (like a path to DESCRIPTION or package root).
what if we make a package or add code to {rextendr} that does all of this?
Yeah, I too wondered if we can go in this direction. It sounds good to me. A bonus is that the functions can be tested in usual testthat tests.
One side-effect is that we are going to be responsible for the dependent packages. We might need to be more careful to release new versions. I think it's a matter of time, though.
Well, we are responsible for maintaining these build scripts that you use in your package and suggest implementing here. My issue is it is much, much harder to maintain and test shell-ish scripts (especially cross-platform) than working with R code.
What I propose is if someone scaffolds a new package using rextendr
, we put rextendr
as dependency, write additional info into Rextendr
field of DESCRIPTION
and then use Rscript
to call a method from rextendr
, which should be available on the system before compilation (because it is a dependency). I am 95% sure CRAN will scream at us for doing such tricks, but perhaps it is worth trying out.
Also, we may even save on scripts and inject this step into Makevars
-- it will be just one line.
Ah, yes, we are. Sorry, I wasn't very clear. I'll later clarify about my concern about maintaining the backward-compatibility.
For using DEDCRIPTION, I don't worry much. Do you have any particular CRAN policy violation in your mind?
No, but it is CRAN. Anything unorthodox can trigger a bad response in my opinion. As an argument -- I honestly have only seen Roxygen using its own field in the DESCRIPTION
. I would definitely check if we can pass R CMD check --as-cran
with additional fields first.
I agree to some extent, but, this time, I believe this usage of DESCRIPTION
is legal. Writing R Extensions says:
There is no restriction on the use of other fields not mentioned here (but using other capitalizations of these field names would cause confusion). Fields Note, Contact (for contacting the authors/developers) and MailingList are in common use. Some repositories (including CRAN and R-forge) add their own fields.
So there's nothing that stops us to add our own fields, at least. For example, ggplot2 has uncommon Config/Needs/website
, which is a convention of pkgdown.
It's still unclear if there's some rule on referring to the package's own file in configure script, but I think there's nothing.
Regarding our responsibility to maintain extendr, I think there are some possible degrees of guaranty that we can provide.
Cargo.lock
can guarantee.I think we all agree with option 1, and disagree that we are at the stage of option 3.
How about option 2? As I wrote above, I think it's possible if the user's package solely depends on Rust crates extendr-*
. The crate versions can be pinned in Cargo.toml
. But, in reality, an R package depends on
Makevars
and configure
We are almost powerless on the change of the R API, but we can do better with the build scripts. We should not break the scripts that currently work fine. This means, if we make the scripts depend on rextendr package, it might be very difficult to make breaking changes on such functions like rextendr::assist_compilation()
. I'm fine either it's a shell script or an R script, but, for now, it should be vendored to avoid the frequent breakage.
But, for now, such a package should be only my package. So, I'm not that strongly against the idea.
I found it a bit difficult to put everything in R script, partly because I stick with zero-dependencies and I wanted an extra check of verifying the checksums (and of course my poor skill of R code writing), but I guess the main reason is the absence of nice way of communicating errors between R and shell script. If we want to handle the errors differently and nicely, we'll have a hard time, like I did...
settings: https://github.com/yutannihilation/string2path/blob/af28a0b89234cc5ae7ec91855015a3cc141fab6d/DESCRIPTION#L29-L41 code: https://github.com/yutannihilation/string2path/blob/main/tools/configure.R
Anyway, while the code is messy, I think I'm ready to submit my package to CRAN soon to see if this is accepted by them. Are there anything we want to try?
Looks better to me. I am not entirely sure about the structure of Fields in DESCRIPTION
(especially naming), but we can review it later after your submission passes all the checks (good luck btw).
Thanks. For naming, it seems some r-lib packages use Config/foo/bar
fields like testthat's Config/testthat/edition
, so I followed this manner. It will probably never conflicts with other package's convention if the field contains the package name.
c.f. https://github.com/cran/gifski/blob/master/configure