pola-rs / r-polars

Bring polars to R
https://pola-rs.github.io/r-polars/

Build r-polars from Nix #54

Closed Sicheng-Pan closed 1 year ago

Sicheng-Pan commented 1 year ago

This PR includes a Nix flake and its corresponding lock file, which can be used for automatic and reproducible package construction. It can also be used to introduce a development shell containing R with rextendr, r-polars, and tidyverse. If you have a system with Nix installed and flake support enabled, you can simply call nix develop in your terminal to enter the development environment.

This PR also includes a bug fix for the Makevars files. I noticed that curdir=$(CURDIR) is actually a shell command that sets the shell variable curdir, so $(curdir) evaluates to an empty string because curdir is not a make variable. "$$curdir" evaluates to the desired path because make passes it to the shell as a reference to the shell variable $curdir.
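
A minimal sketch of the difference, assuming GNU make and a POSIX shell are available. The Makefile and its demo target are made up for illustration and are not the package's actual Makevars:

```shell
#!/bin/sh
# Illustrative only: a throwaway Makefile, not the package's Makevars.
workdir=$(mktemp -d)

# The recipe line assigns a shell variable, then prints it two ways:
#   $(curdir)  is expanded by make (no such make variable, so empty)
#   $$curdir   reaches the shell as $curdir (the value assigned above)
printf 'demo:\n\t@curdir="$(CURDIR)"; echo "make: [$(curdir)]"; echo "shell: [$$curdir]"\n' > "$workdir/Makefile"

output=$(cd "$workdir" && make -s demo)
echo "$output"
rm -rf "$workdir"
```

The first printed line should show empty brackets and the second should show the temporary directory, which mirrors why only the "$$curdir" form resolves to a path.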

sorhawell commented 1 year ago

@Sicheng-Pan

Try appending the following lines to the .Rbuildignore file in the repository root; with them, the build passes R CMD check on my machine:

^flake.nix$
^flake.lock$
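
One way to append those entries from a terminal, as a sketch that assumes it is run from the package root. The dots are escaped here so that each pattern matches a literal dot, which is slightly stricter than the patterns above; the loop skips entries that are already present:

```shell
# Run from the package root. Each .Rbuildignore line is a regular
# expression matched against file paths.
for pattern in '^flake\.nix$' '^flake\.lock$'; do
  # -x: whole-line match, -F: compare as a fixed string, not a regex
  grep -qxF -- "$pattern" .Rbuildignore 2>/dev/null ||
    printf '%s\n' "$pattern" >> .Rbuildignore
done
```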

sorhawell commented 1 year ago

Hi, thank you for your PR. Nix could be a good thing. However, I am only vaguely aware of Nix, so I'm trying to understand: how, and how often, should the lock file be maintained? What command should I use to install rpolars from scratch with Nix?

Sicheng-Pan commented 1 year ago

If you have Nix installed in your system, please enable flake support first (since it is an experimental feature).
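
One common way to enable it persistently, sketched under the assumption of a per-user configuration file at ~/.config/nix/nix.conf (system-wide installs may use /etc/nix/nix.conf instead):

```shell
# Per-user Nix configuration file (location assumed, see the note above).
nix_conf="${XDG_CONFIG_HOME:-$HOME/.config}/nix/nix.conf"
mkdir -p "$(dirname "$nix_conf")"

# Append the setting only if no experimental-features line exists yet.
if ! grep -qs 'experimental-features' "$nix_conf"; then
  echo 'experimental-features = nix-command flakes' >> "$nix_conf"
fi
```

If the file already contains an experimental-features line, the snippet leaves it alone, so that line may need editing by hand to include both nix-command and flakes.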

Then you can use nix develop github:Sicheng-Pan/r-polars to enter an environment with R and rpolars installed, without cloning the source yourself. If the rpolars source code (with flake.nix) is already on your system, simply enter the source directory and call nix develop to enter the same environment. You can modify the flake to change which libraries are available in R. See the NixOS Wiki for more information about the wrappers.

To update the lock, simply run nix flake update in the source directory. The lock only pins the versions of the flake's inputs, which in turn determine the versions of the Nix, R, and Rust toolchains, so it should be updated when a newer R or Rust is preferred. The nixpkgs input is currently redirected to a community fork because the official Nix and Rust nightly have a minor incompatibility; it should be changed back to the official version once this is fixed.

Here is a minimal flake that can be used to construct the R environment. Simply make a new flake.nix file, paste the contents, and call nix run in the same folder to enter the R REPL:

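{
  inputs = {
    # Package set that provides R and rWrapper
    nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
    # Helper for generating outputs for each default system
    flake-utils.url = "github:numtide/flake-utils";
    # The fork that carries this PR's flake.nix
    rpolars.url = "github:Sicheng-Pan/r-polars";
  };

  outputs = { self, nixpkgs, flake-utils, rpolars }:
    flake-utils.lib.eachDefaultSystem (system:
      let pkgs = nixpkgs.legacyPackages.${system};
      in {
        # R wrapped so that the listed packages are on its library path
        packages.default = pkgs.rWrapper.override {
          packages = [ rpolars.packages.${system}.default ];
        };
      });
}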

sorhawell commented 1 year ago

I ran the following (sorry, I did not know how to enable flakes and nix-command simultaneously):

nix develop github:Sicheng-Pan/r-polars --extra-experimental-features flakes --extra-experimental-features nix-command

and it installed on a 5-year-old MacBook while I watched StarCraft 2 matches on YouTube.

some simple observations:

Sicheng-Pan commented 1 year ago

> 4 GB of dependencies to download before unpacking and installing.

Well, basically it's fetching all required dependencies, so at worst you have one duplicate build system on your computer. If you're using Nix for everything then this is necessary and sufficient, and other projects relying on the same dependencies will share this environment.

> took 1 hour to build from scratch

This is indeed the case (under 10 min on an M1), but most of the time is spent on cargo build. Admittedly, Cargo may not behave nicely with Nix, and successive builds may take longer than necessary.

> Storing the cache would likely max out the GitHub Actions 10 GB limit.

I'm setting up a GitHub Action to build the cache and store it on Cachix. It will take a while, but this makes it easier for other people to use the Nix flake.

> The average R analyst/data scientist would likely prefer a 20-second install of the latest binary from a repository with a one-liner, or from CRAN / R-universe.

If this package is on CRAN then, from my understanding, it will automatically be available in Nixpkgs, and anyone using R with Nix can simply fetch the binary without compilation (I believe this is true for most packages on CRAN). So I believe publishing this package on CRAN would be helpful.

> The average rpolars developer would probably install Rust anyway to interact with the compilation process.

Currently the flake development environment also includes nightly Rust and rextendr, so you can also use rextendr::document to compile the source in the R REPL. You can also add devtools (and other packages you would like to use) to the R environment for testing purposes.

Sicheng-Pan commented 1 year ago

Update: Using the flake with Cachix can give you an instant development environment from the latest commit (remember to set yourself as a trusted Nix user and accept the alternative cache signature). Building the cache with GitHub Actions takes ~20 min on non-macOS systems and ~30 min on macOS (x86) systems.
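
A sketch of those two steps, assuming a multi-user Nix install whose system configuration lives in /etc/nix/nix.conf, and assuming the flake advertises its cache through its own configuration (neither is confirmed in this thread):

```shell
# Print the nix.conf line that marks the current user as trusted.
# It belongs in the system-wide /etc/nix/nix.conf; editing that file
# needs root access, and the Nix daemon must be restarted afterwards.
trusted_line="trusted-users = root $(id -un)"
echo "$trusted_line"

# With that in place, cache settings supplied by a flake can be accepted
# per invocation instead of answering the interactive prompt:
#   nix develop github:Sicheng-Pan/r-polars --accept-flake-config
```

The snippet only prints the line rather than editing the system file, so it is safe to run as an ordinary user.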

sorhawell commented 1 year ago

I must admit I'm curious to see some more of Nix in action. I guess it is a realistic alternative to the rocker or r-lib infrastructure in some future use case for rpolars.

Sicheng-Pan commented 1 year ago

Some reminders if you would like to merge this request:

sorhawell commented 1 year ago

@eitsupi refactored away curdir in #58, which will be merged in a few hours. After that, could I ask you, @Sicheng-Pan, to merge the updates into this PR one last time? Then we merge :)

Sicheng-Pan commented 1 year ago

@sorhawell I've merged the updates, and they seem to be correct.

sorhawell commented 1 year ago

The R devel (Ubuntu) pak error is not caused by this PR; it also happened on main, see #60.

I will disable R devel for now and raise an issue.

sorhawell commented 1 year ago

Many thx @Sicheng-Pan :)