Closed eitsupi closed 7 months ago
With the GitHub release, a Windows/macOS/Linux x64 user can install without Rtools/make/gcc/clang, and installation is faster.
Otherwise this looks interesting. We could add such binaries to the release on GitHub and hardcode a link to the appropriate versioned binary inside each R source package to avoid a version mismatch. Edit: it seems this is what was done here as well.
We might risk that it fails sometime in the future if the URL changes.
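To make the "versioned binary" idea concrete, here is a minimal sketch of how a configure script could derive a pinned download URL from the package version instead of relying on a `latest` link. The helper names, the `v<version>` tag format, and the asset name are assumptions for illustration only; the `releases/download/<tag>/<file>` layout is the usual GitHub release asset URL pattern.

```shell
#!/bin/sh
# Read the "Version:" field from a DESCRIPTION file (first match only).
pkg_version() {
  sed -n 's/^Version:[[:space:]]*//p' "$1" | head -n 1
}

# Build a download URL pinned to one release tag.
# Assumption: release tags are named "v<version>"; the asset name is hypothetical.
versioned_url() {
  repo="$1"
  version="$2"
  asset="$3"
  echo "https://github.com/${repo}/releases/download/v${version}/${asset}"
}

# Example use from the package root (only runs if DESCRIPTION exists):
if [ -f DESCRIPTION ]; then
  versioned_url "pola-rs/r-polars" "$(pkg_version DESCRIPTION)" "example-asset.tar.gz"
fi
```

Pinning to the version means an old source package keeps pointing at the binary it was built against, although it still breaks if the repository itself moves.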
As far as I understand, this solution is closely related to our current cross-compilation, in that the Makevars will download the Rust binary, right? It still requires some Make and a compiler and linker to wrap up the package, right?
Would it be possible to write a portable configure / configure.win which literally starts a new R install.packages call and installs the GitHub binary package if a compatible file is found for the user's machine? After installation, the outer installation would be aborted silently. Then no Make + compiler + linker would be needed for the non-cross releases.
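As a rough sketch of the "if a compatible file is found for the user machine" step, a configure script could map the output of `uname` to a platform label and only attempt the binary route when that mapping succeeds. The function name and the labels below are illustrative assumptions, not the file names actually used in the releases.

```shell
#!/bin/sh
# Hypothetical helper: map an OS name and CPU architecture (as printed by
# `uname -s` and `uname -m`) to a label for a pre-built binary.
# Returns non-zero when the platform is not covered.
binary_target() {
  os="$1"
  arch="$2"
  case "$os" in
    Darwin) os_part="apple-darwin" ;;
    Linux)  os_part="unknown-linux-gnu" ;;
    *)      return 1 ;;  # unsupported OS: caller should build from source
  esac
  case "$arch" in
    x86_64|amd64)  arch_part="x86_64" ;;
    arm64|aarch64) arch_part="aarch64" ;;
    *)             return 1 ;;  # unsupported CPU architecture
  esac
  echo "${arch_part}-${os_part}"
}

# Example use inside configure:
if target=$(binary_target "$(uname -s)" "$(uname -m)"); then
  echo "pre-built binary candidate: ${target}"
else
  echo "no pre-built binary for this platform; building from source"
fi
```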
It is possible to start a direct binary installation via configure, and it actually installs. I use exit 1 to stop the rest of the normal compilation/installation, but this triggers R to clean up the installed package. If there were a way to make R gracefully skip the remaining installation, that could be very cool, I think.
> remotes::install_github("pola-rs/r-polars",ref = "configure_binary_install")
Downloading GitHub repo pola-rs/r-polars@configure_binary_install
── R CMD build ────────────────────────────────────────────────────────────────────────────────────────────────────────
✔ checking for file ‘/private/var/folders/v1/b2c26lpn2yjd997jg_gn4fgc0000gn/T/Rtmp5qh4qb/remotes97f87a372e00/pola-rs-r-polars-f8d88b8/DESCRIPTION’ ...
─ preparing ‘polars’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ checking for LF line-endings in source and make files and shell scripts (876ms)
─ checking for empty or unneeded directories
─ building ‘polars_0.8.1.9000.tar.gz’
* installing *source* package ‘polars’ ...
** using staged installation
[1] "trying to install directly binary package!!!!"
[1] "installing directly from download binary package"
Installing package into ‘/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/00LOCK-polars/00new’
(as ‘lib’ is unspecified)
trying URL 'https://github.com/pola-rs/r-polars/releases/latest/download/polars__x86_64-apple-darwin20.tgz'
Content type 'application/octet-stream' length 18172308 bytes (17.3 MB)
==================================================
downloaded 17.3 MB
=== Hi there, throwing an error here on purpose here ===
ERROR: configuration failed for package ‘polars’
* removing ‘/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/polars’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library/polars’
Warning message:
In i.p(...) :
installation of package ‘/var/folders/v1/b2c26lpn2yjd997jg_gn4fgc0000gn/T//Rtmp5qh4qb/file97f83cad40af/polars_0.8.1.9000.tar.gz’ had non-zero exit status
We might risk that it fails sometime in the future if the URL changes.
I think the most worrisome thing about this is that we are installing from the GitHub fork repository of extendr. If the repository or the branch disappears, r-polars can no longer be built.
If we can't find the binary library, it's not a big problem because you just fall back to the source build.
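The fallback described here could look roughly like the sketch below: try to download the pre-built library, and if that fails for any reason (URL moved, no network, no curl), report that a source build is needed. `fetch_prebuilt` and `obtain_library` are hypothetical names; the curl flags used (`-f`, `-s`, `-S`, `-L`, `-o`) are standard.

```shell
#!/bin/sh
# Try to download a pre-built library to a destination path.
# -f makes curl return non-zero on HTTP errors such as a 404 after a URL change.
fetch_prebuilt() {
  url="$1"
  dest="$2"
  curl -fsSL -o "$dest" "$url"
}

# Print "prebuilt" if the download worked, otherwise "source".
# A real configure script would branch on this to decide whether cargo is needed.
obtain_library() {
  url="$1"
  dest="$2"
  if fetch_prebuilt "$url" "$dest" 2>/dev/null; then
    echo "prebuilt"
  else
    rm -f "$dest"  # do not leave a partial download behind
    echo "source"
  fi
}

# Usage (hypothetical URL and path):
#   mode=$(obtain_library "https://example.com/libexample.tar.gz" "tools/libexample.tar.gz")
```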
It still requires some Make and a compiler and linker to wrap up the package, right?
Of course it does. So if we want to do a binary installation, we need something else.
We own the rpolars organization where the extendr fork lives, so I don't see why we would delete it. We could add branch protection to the critical repositories and branches. I guess it could be about as secure as r-polars.
That said I think I would also personally maintain some forks just to be sure.
Historically, I think, it was not ideal to depend directly on extendr, because their scope and aims were just not the same as ours. We might critically need an extra feature or need to cherry-pick some new changes. It is of course not ideal to stray too far from extendr either.
Maybe it would be possible to set up a fairly cheap CRAN-like R package repository which redirects downloads to GitHub releases.
What is the protocol for client interactions with a package repository? I wonder if a github.io page could act as R repository and redirect to GitHub releases.
https://blog.sellorm.com/2019/03/29/lifting-the-lid-on-cran/ https://blog.sellorm.com/2019/03/30/build-your-own-cran-like-repo/
I wonder if a github.io page could act as R repository
I think this repo for the pak package did that. https://github.com/r-lib/r-lib.github.io
Nice. I'm also impressed by your background knowledge :)
I haven't used it so I couldn't comment right away, but it seems that we can host binary packages on GitHub Pages using the drat package by @eddelbuettel.
https://eddelbuettel.github.io/drat/vignettes/dratstepbystep/
Perhaps it's better to deploy binary packages for amd64 macOS and Windows with drat, and recommend source + binary library installation for other platforms?
(pak's binary releases are done by a dedicated function that has the repository name hard-coded inside pak, and it seems difficult to reuse that.)
I wonder if a github.io page could act as R repository
Yes.
That is one of the key ideas behind drat: username.github.io/drat. For such a repo, you then only need 'username', as we can autogenerate the rest of the URL.
Also note that r-universe is very similar in providing per-user repositories. Like r-universe, drat can host binaries and source, but unlike r-universe it will not build them for you.
I'm wondering if we could use pre-built binaries on R-universe, and it seems to be possible by detecting the environment variable MY_UNIVERSE. (https://github.com/r-universe-org/help/issues/75#issuecomment-1750197115)
Perhaps we can distribute SIMD-enabled binary packages via R-universe by configuring them to use pre-built binaries when this is detected.
Of course drat is attractive, but I don't see much benefit in continuing to distribute binaries with SIMD disabled in R-universe, so perhaps it would be worth incorporating an R-universe-specific configuration?
Those would be questions for Jeroen. I am not sure how much you can influence what/how he builds. And given how much he builds reliably it is compelling.
Those would be questions for Jeroen.
This is simply a matter of whether scripts such as configure in the polars package detect the environment variable MY_UNIVERSE.
It is the same as prqlr detecting NOT_CRAN and downloading the binary.
https://github.com/eitsupi/prqlr/blob/67aed8cb89997486c991ea76e337654381cd7635/configure#L36-L66
The R-universe builder does not complain about what R packages download, so it is even possible to download the Rust nightly toolchain, for example. (The reason we don't do that now is because we didn't know how to tell if it was on the R-universe or not.)
Sure, conditioning on MY_UNIVERSE is easy enough and I do so in a package. I meant this more as 'if you need details or want to clarify, he is the one to ask', as in the 'how much you can influence' part: we do not get to modify his yaml files. But fully agreed that configure is a valid package-side hook.
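For readers following along, conditioning on these variables in configure might look like the sketch below. It assumes, as discussed in this thread, that MY_UNIVERSE is set to a non-empty value during R-universe builds and that users opt in with NOT_CRAN=true; the function name `use_prebuilt` is hypothetical.

```shell
#!/bin/sh
# Decide whether configure may download a pre-built binary.
# Returns 0 (yes) on R-universe builds or when the user opted in, 1 otherwise.
use_prebuilt() {
  # Assumption from the thread: MY_UNIVERSE is non-empty on R-universe builders.
  if [ -n "${MY_UNIVERSE:-}" ]; then
    return 0
  fi
  # Explicit opt-in, following the NOT_CRAN=true convention.
  if [ "${NOT_CRAN:-}" = "true" ]; then
    return 0
  fi
  return 1
}

if use_prebuilt; then
  echo "configure: will try to download a pre-built binary"
else
  echo "configure: building the Rust library from source"
fi
```

With neither variable set (the CRAN-like default), nothing is downloaded and the package builds from source.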
I thought the same thing could be done here and have recently added that functionality to prqlr (eitsupi/prqlr#195). The arm64 architecture can also be supported due to cargo's excellent cross-compilation, and the glibc version should be negligible by selecting the musl target. (e.g. #86)
When I tried this in #435, it seemed that some tests cannot pass when the Rust library is built with the musl target. https://github.com/pola-rs/r-polars/actions/runs/6598220293/job/17926536231?pr=435#step:12:144
So I think it's reasonable to lower the Ubuntu version and use the gnu target.
The arrow package will attempt to download a pre-built binary from the Internet if NOT_CRAN=true. https://arrow.apache.org/docs/11.0/r/articles/install.html#r-source-package-with-libarrow-binary
This method also has the advantage that many features disabled by default (i.e., on CRAN) during source installation are enabled in the pre-built binaries.
I thought the same thing could be done here and have recently added that functionality to prqlr (eitsupi/prqlr#195). The arm64 architecture can also be supported due to cargo's excellent cross-compilation, and the glibc version should be negligible by selecting the musl target. (e.g. #86)
Unlike the binary installation via GitHub releases that this repository currently offers, it has the advantage of being valid for installation from anywhere. Like:
For example, the installation on arm64 Linux from R-universe is as follows:
I would like to bring the same thing here. What do you think? The administrative disadvantage is the need for additional library versioning and release work. (Don't forget to raise the version in Cargo.toml when making changes on the Rust side.)