pola-rs / r-polars

Polars R binding
https://pola-rs.github.io/r-polars/

CRAN Release? #80

Closed eitsupi closed 3 months ago

eitsupi commented 1 year ago

The Rust version of Debian testing will soon be 1.65, which may allow us to build rpolars tuned to build in the R-universe. (Note that the macOS builder's Rust version may be older than Debian's. eitsupi/prqlr#94) https://tracker.debian.org/pkg/rustc https://qa.debian.org/excuses.php?experimental=1&package=rustc

@sorhawell Could you potentially do a CRAN release? We will probably need to modify the build scripts (like #29) and the documentation. (Of course, I would like to contribute to that.)

sorhawell commented 1 year ago

Cool :)

For mac this issue will trigger a warning or error with CRAN check. I likely have to bundle some shared macOS tool or include as a requirement in DESCRIPTION.

Is it allowed on CRAN, not to release on all OS, e.g. skip mac in first round?

Regarding #29 Using CARGO_HOME is very useful to run CHECK in development without recompiling for 30 min. There is something about an environment variable called something like IS_CRAN right? I could disable CARGO_HOME for cran builds then. I failed to rediscover the exact envvar name in docs.

eitsupi commented 1 year ago

For mac this issue will trigger a warning or error with CRAN check. I likely have to bundle some shared macOS tool or include as a requirement in DESCRIPTION.

I'm sorry, I don't understand the whole issue, but are you saying that, due to the issue, rpolars can't update polars? (Currently, R CMD check is succeeding on the main branch, right?)

There is something about an environment variable called something like IS_CRAN right?

Yes, this must be NOT_CRAN. (See also extendr/rextendr#233.) Note that this environment variable name is a convention introduced independently by the devtools package, not something defined by CRAN.
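As a minimal sketch of how a build script could branch on NOT_CRAN: the helper name `cargo_home_for_build` and the `./src` path are assumptions for illustration, not the package's actual Makevars. A development build (NOT_CRAN set to "true") reuses the user's Cargo cache, while any other build keeps Cargo's files inside the package build directory.

```shell
#!/bin/sh
# Hypothetical helper: print the Cargo home a build should use.
# $1 = value of NOT_CRAN (may be empty), $2 = package source directory.
cargo_home_for_build() {
  if [ "$1" = "true" ]; then
    # Development build: reuse the user's cache to avoid a full recompile.
    printf '%s\n' "${CARGO_HOME:-$HOME/.cargo}"
  else
    # CRAN-like build: keep everything inside the package build directory.
    printf '%s\n' "$2/.cargo"
  fi
}

# Example call; "./src" is an assumed location.
cargo_home_for_build "${NOT_CRAN:-}" "./src"
```

The comparison is against the literal string "true", so an unset or empty NOT_CRAN falls through to the CRAN-like branch.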

sorhawell commented 1 year ago

r-lib/rcmdcheck does not support filtering errors. Many projects get the same warning about large binary size, which CRAN is not enforcing anyway.

The best advice I could find in the issue tracker is to ignore all warnings in R CMD check, to avoid builds being flagged as failed.

I found that practice sub-optimal, as I would not notice my own newly introduced warnings. Currently I run this filter in the workflow to catch all warnings and errors from R CMD check that were not already known.

Currently I ignore the binary size warning and the Mac dependency warning/error. I don't know how to fix it yet. However, I have never heard of any Mac installation, other than the CRAN check, where it was actually a problem.
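The idea of such a filter can be sketched as follows. This is not the filter used in the workflow: the function name, the allow-list patterns, and the sample lines are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: read a check log on stdin and print only the
# WARNING/ERROR lines that do not match an allow-list of known messages.
# The two patterns below are placeholders, not the project's real list.
new_check_problems() {
  grep -E 'WARNING|ERROR' \
    | grep -v -e 'binary size' -e 'mac dependency' \
    || true   # an empty result is not a failure of the filter itself
}

# Made-up sample lines, only to show the behaviour.
sample='known: binary size WARNING
known: mac dependency ERROR
new: something else WARNING
all good OK'

problems=$(printf '%s\n' "$sample" | new_check_problems)
if [ -n "$problems" ]; then
  printf 'unexpected check output:\n%s\n' "$problems"
fi
```

In a CI job, the last branch would typically also exit non-zero so that only new problems fail the build.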

sorhawell commented 1 year ago

@eitsupi Oh sorry I did not reply well to your question.

I'm sorry, I don't understand the whole issue, but are you saying that, due to the issue, rpolars can't update polars?

We can update polars just fine. The Mac DLL warning just started at a certain commit when rust-polars introduced some new system calls. I have not solved the warning yet. I suspect that any Mac does support these system calls.

I could immediately try to submit a build to the cran windows test build system and see what we get back. I wonder if there are any limitations on build time.

eitsupi commented 1 year ago

Thanks for the reply. I don't have macOS and don't have enough knowledge to help solve that problem, but I think there is a possibility that someone else can contribute.

sorhawell commented 1 year ago

@eitsupi I believe the Windows and Mac CRAN check warnings/failures have disappeared since around R 4.3.0. I guess we should be able to make a submission to CRAN in parallel with R-universe now :)

eitsupi commented 1 year ago

Glad to hear that!

Debian's Rust version is still 1.63, so we will not be able to do a CRAN release any time soon.

I hope the Debian Rust version will be updated by summer, so that we can get the tasks for the CRAN release done by then. I created a milestone: https://github.com/pola-rs/r-polars/milestone/1.

sorhawell commented 1 year ago

Actually I think we can get away with using Required Rust version >=1.62

sorhawell commented 1 year ago

ohh now I remember that namespace thing that required >= 1.64
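A minimum-version requirement like this can be checked before building. The sketch below is an assumption about how one might do it, not something taken from the package: `version_ge` is a hypothetical helper that compares only the major and minor fields, and `MIN_RUST` is set to the 1.64 mentioned above.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A >= B.
# Only the major and minor fields are compared (enough for "1.64"-style minimums).
version_ge() {
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
  [ "$a_major" -gt "$b_major" ] && return 0
  [ "$a_major" -lt "$b_major" ] && return 1
  [ "$a_minor" -ge "$b_minor" ]
}

MIN_RUST="1.64"   # assumed minimum, taken from the comment above

# Only query rustc when it is actually installed.
if command -v rustc >/dev/null 2>&1; then
  # `rustc --version` prints e.g. "rustc 1.65.0 (...)"; field 2 is the version.
  found=$(rustc --version | awk '{print $2}')
  if version_ge "$found" "$MIN_RUST"; then
    echo "rustc $found is new enough"
  else
    echo "rustc $found is older than $MIN_RUST"
  fi
fi
```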

grantmcdermott commented 1 year ago

Out of interest, I just peeked at the Debian rustc logs and 1.64 was accepted into the unstable branch earlier this week. See News here: https://packages.qa.debian.org/r/rustc.html

Last time it only took a week for 1.63 to migrate to the testing branch (which is what CRAN uses) after acceptance on unstable. So hopefully not too long now...

eitsupi commented 1 year ago

Last time it only took a week for 1.63 to migrate to the testing branch (which is what CRAN uses) after acceptance on unstable. So hopefully not too long now...

Debian Rust has been completely stuck for the last six months, and it makes little sense to base the schedule on anything earlier than that.....

eitsupi commented 1 year ago

The migration to Rust 1.65 seems to be going well, judging by the progress status. At the earliest, we expect Debian testing to be updated to Rust 1.65 in 5 days.

@sorhawell Any thoughts on submitting to CRAN?

One major issue is that the main branch assumes nightly toolchains and cannot be submitted to CRAN, so I think we need to consider whether to continue to enable simd in the main branch. -> Resolved 🎉

sorhawell commented 1 year ago

@eitsupi To me, #262 looks like an easier and more elegant way for us to maintain the CRAN and R-universe release channels :) Awesome!

I have not seen any benchmarks with and without simd yet, so maybe it does not matter. If there is a considerable speed boost, it would be a pity if users forgot about it. I guess we can compensate by mentioning in the README / docs that the GitHub binary is also a very easy way to install polars on most OSes, and is faster.

eitsupi commented 1 year ago

@sorhawell Yeah, the #262 way with a feature seems to work.

As you say, documentation updates etc. will be needed to follow up on #262. Can we merge #262 for now?

eitsupi commented 1 year ago

@sorhawell @Sicheng-Pan The changes made by #233 prevent builds except for nightly-toolchain. https://github.com/pola-rs/r-polars/actions/runs/5411224946/jobs/9833719506?pr=274

Can we revert this change for now? Or could you fix that error?

eitsupi commented 1 year ago

I created a new issue about the build failure with stable Rust #276.

eitsupi commented 1 year ago

FYI, I sent the latest version to the R-hub builder and got the following notes.

* checking CRAN incoming feasibility ... [15s] NOTE
Maintainer: 'Soren Welling <sorhawell@gmail.com>'

New submission

Version contains large components (0.6.1.9000)

Possibly misspelled words in DESCRIPTION:
  Polars (2:8)
  polars (9:37)

Found the following (possibly) invalid URLs:
  URL: https://pola-rs.github.io/polars/py-polars/html/reference/expressions (moved to https://pola-rs.github.io/polars/py-polars/html/reference/expressions/)
    From: man/Expr_class.Rd
    Status: 200
    Message: OK

The Title field starts with the package name.
The Title field should be in title case. Current version is:
'Polars ported to R'
In title case that is:
'Polars Ported to R'

The Description field should not start with the package name,
  'This package' or similar.

eitsupi commented 1 year ago

Debian testing's rustc is now 1.66!

@sorhawell I am going to work on the Makevars modifications and document updates, is it possible for you to make a submission to CRAN within the next few days? https://github.com/pola-rs/r-polars/milestone/1

sorhawell commented 1 year ago

Debian testing's rustc is now 1.66!

@sorhawell I am going to work on the Makevars modifications and document updates, is it possible for you to make a submission to CRAN within the next few days? https://github.com/pola-rs/r-polars/milestone/1

Does Testing mean CRAN supports it now? Or does Testing mean in a few days 1.66 will be stable on Debian?

I can do a submission tomorrow e.g. ? :)

eitsupi commented 1 year ago

Does Testing mean CRAN supports it now? Or does Testing mean in a few days 1.66 will be stable on Debian?

Debian "testing" is one of Debian's release channels for rolling releases. The Debian used on CRAN is Debian testing. https://cran.r-project.org/web/checks/check_flavors.html

I can do a submission tomorrow e.g. ? :)

Wonderful! I will send some PRs to resolve the issues added to the milestone. I hope you can consider submitting to CRAN once they are resolved.

eitsupi commented 1 year ago

After installing cargo and cmake (I didn't know this was necessary) via apt on the r-base Docker container (debian:testing based), I confirmed that latest polars can be installed. 🎉

root@ebc30d914202:/# R

R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> remotes::install_github("https://github.com/pola-rs/r-polars")
Downloading GitHub repo pola-rs/r-polars@HEAD
Running `R CMD build`...
* checking for file ‘/tmp/RtmpAfMh2K/remotese5b622aef04/pola-rs-r-polars-c951b5b/DESCRIPTION’ ... OK
* preparing ‘polars’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘polars_0.6.1.9000.tar.gz’
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
* installing *source* package ‘polars’ ...
** using staged installation
** libs
using C compiler: ‘gcc (Debian 12.2.0-14) 12.2.0’
rm -Rf polars.so ./rust/target/release/libr_polars.a entrypoint.o
gcc -I"/usr/share/R/include" -DNDEBUG       -fpic  -g -O2 -ffile-prefix-map=/build/r-base-O3lIg6/r-base-4.3.1=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2  -c entrypoint.c -o entrypoint.o
if [ "" != "true" ]; then \
        export CARGO_HOME=/tmp/RtmpN6mFGI/R.INSTALLecf1975c4d/polars/src/.cargo; \
fi && \
export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/.cargo/bin" && \
if [ "" == "true" ]; then \
        cargo build --lib --profile release --manifest-path="./rust/Cargo.toml" --all-features; \
else \
        cargo build --lib --profile release --manifest-path="./rust/Cargo.toml"; \
fi
    Updating git repository `https://github.com/rpolars/extendr`
    Updating crates.io index
    Updating git repository `https://github.com/pola-rs/polars.git`
    Updating git repository `https://github.com/jorgecarleitao/arrow2`
    Updating git submodule `https://github.com/apache/arrow-testing`
    Updating git submodule `https://github.com/apache/parquet-testing`
    Updating git repository `https://github.com/ritchie46/jsonpath`
  Downloaded adler v1.0.2
  Downloaded multiversion-macros v0.7.1
  Downloaded lexical-core v0.8.5
  Downloaded phf_generator v0.11.1
  Downloaded ryu v1.0.12
  Downloaded now v0.1.3
  Downloaded json-deserializer v0.4.4
  Downloaded async-stream-impl v0.3.3
  Downloaded futures-sink v0.3.25
  Downloaded phf_shared v0.11.1
  Downloaded rustversion v1.0.11
  Downloaded value-trait v0.5.1
  Downloaded target-features v0.1.3
  Downloaded ppv-lite86 v0.2.17
  Downloaded xxhash-rust v0.8.6
  Downloaded libflate_lz77 v1.1.0
  Downloaded zstd-safe v5.0.2+zstd.1.5.2
  Downloaded zstd-safe v6.0.3+zstd.1.5.2
  Downloaded socket2 v0.4.9
  Downloaded indexmap v1.9.2
  Downloaded unicode-width v0.1.10
  Downloaded smallvec v1.10.0
  Downloaded multiversion v0.7.1
  Downloaded crc-catalog v1.1.1
  Downloaded rle-decode-fast v1.0.3
  Downloaded planus v0.3.1
  Downloaded crossbeam-channel v0.5.6
  Downloaded thiserror v1.0.40
  Downloaded rand_core v0.6.4
  Downloaded sqlparser v0.30.0
  Downloaded state v0.5.3
  Downloaded simdutf8 v0.1.4
  Downloaded crossbeam-deque v0.8.2
  Downloaded zstd v0.11.2+zstd.1.5.2
  Downloaded rayon v1.6.1
  Downloaded version_check v0.9.4
  Downloaded simd-json v0.7.0
  Downloaded sysinfo v0.28.4
  Downloaded zstd v0.12.3+zstd.1.5.2
  Downloaded unicode-ident v1.0.6
  Downloaded signal-hook v0.3.14
  Downloaded itoap v1.0.1
  Downloaded hashbrown v0.13.2
  Downloaded parquet-format-safe v0.2.4
  Downloaded syn v2.0.18
  Downloaded serde v1.0.152
  Downloaded ndarray v0.15.6
  Downloaded serde_derive v1.0.152
  Downloaded regex-syntax v0.6.28
  Downloaded lz4-sys v1.9.4
  Downloaded mio v0.8.5
  Downloaded syn v1.0.107
  Downloaded futures-util v0.3.25
  Downloaded regex v1.7.1
  Downloaded jemalloc-sys v0.5.2+5.3.0-patched
  Downloaded parquet2 v0.17.1
  Downloaded hashbrown v0.12.3
  Downloaded brotli-decompressor v2.3.4
  Downloaded spin v0.9.4
  Downloaded rayon-core v1.10.1
  Downloaded num-traits v0.2.15
  Downloaded lexical-parse-float v0.8.5
  Downloaded chrono v0.4.23
  Downloaded chrono-tz v0.8.1
  Downloaded libc v0.2.139
  Downloaded strum_macros v0.24.3
  Downloaded strength_reduce v0.2.4
  Downloaded static_assertions v1.1.0
  Downloaded signal-hook-registry v1.4.0
  Downloaded rand_distr v0.4.3
  Downloaded rand v0.8.5
  Downloaded phf v0.11.1
  Downloaded crossbeam-epoch v0.9.13
  Downloaded pin-project v1.0.12
  Downloaded miniz_oxide v0.6.2
  Downloaded zstd-sys v2.0.5+zstd.1.5.2
  Downloaded tokio v1.28.0
  Downloaded fast-float v0.2.0
  Downloaded ethnum v1.3.2
  Downloaded libR-sys v0.5.0
  Downloaded strum v0.24.1
  Downloaded proc-macro2 v1.0.60
  Downloaded matrixmultiply v0.3.2
  Downloaded getrandom v0.2.8
  Downloaded paste v1.0.11
  Downloaded lexical-parse-integer v0.8.6
  Downloaded siphasher v0.3.10
  Downloaded serde_json v1.0.91
  Downloaded libz-ng-sys v1.1.8
  Downloaded brotli v3.3.4
  Downloaded nanorand v0.7.0
  Downloaded libflate v1.2.0
  Downloaded itoa v1.0.5
  Downloaded crossterm v0.25.0
  Downloaded crc32fast v1.3.2
  Downloaded lexical-util v0.8.5
  Downloaded float-cmp v0.9.0
  Downloaded aho-corasick v0.7.20
  Downloaded quote v1.0.28
  Downloaded memmap2 v0.5.8
  Downloaded lexical-write-integer v0.8.5
  Downloaded libm v0.2.6
  Downloaded lexical-write-float v0.8.5
  Downloaded lexical v6.1.1
  Downloaded fallible-streaming-iterator v0.1.9
  Downloaded either v1.8.0
  Downloaded streaming-iterator v0.1.9
  Downloaded parse-zoneinfo v0.3.0
  Downloaded iana-time-zone v0.1.53
  Downloaded futures-executor v0.3.25
  Downloaded futures-core v0.3.25
  Downloaded enum_dispatch v0.3.11
  Downloaded once_cell v1.17.0
  Downloaded bytes v1.3.0
  Downloaded base64 v0.21.0
  Downloaded phf_codegen v0.11.1
  Downloaded futures v0.3.25
  Downloaded flume v0.10.14
  Downloaded cc v1.0.78
  Downloaded bytemuck v1.13.0
  Downloaded memoffset v0.7.1
  Downloaded memchr v2.5.0
  Downloaded lz4 v1.24.0
  Downloaded indenter v0.3.3
  Downloaded halfbrown v0.1.18
  Downloaded foreign_vec v0.1.0
  Downloaded argminmax v0.6.1
  Downloaded ahash v0.8.2
  Downloaded futures-io v0.3.25
  Downloaded futures-channel v0.3.25
  Downloaded time v0.1.45
  Downloaded pin-project-lite v0.2.9
  Downloaded parking_lot v0.12.1
  Downloaded lazy_static v1.4.0
  Downloaded hash_hasher v2.0.3
  Downloaded glob v0.3.1
  Downloaded arrow-format v0.8.1
  Downloaded fxhash v0.2.1
  Downloaded flate2 v1.0.25
  Downloaded dyn-clone v1.0.10
  Downloaded crossbeam-utils v0.8.14
  Downloaded cmake v0.1.49
  Downloaded byteorder v1.4.3
  Downloaded bytemuck_derive v1.4.0
  Downloaded rand_chacha v0.3.1
  Downloaded parking_lot_core v0.9.6
  Downloaded futures-macro v0.3.25
  Downloaded semver v1.0.16
  Downloaded pin-project-internal v1.0.12
  Downloaded lock_api v0.4.9
  Downloaded bitflags v1.3.2
  Downloaded smartstring v1.0.1
  Downloaded pkg-config v0.3.26
  Downloaded snap v1.1.0
  Downloaded slab v0.4.7
  Downloaded seq-macro v0.3.2
  Downloaded pin-utils v0.1.0
  Downloaded log v0.4.17
  Downloaded crc v2.1.0
  Downloaded jobserver v0.1.25
  Downloaded signal-hook-mio v0.2.3
  Downloaded scopeguard v1.1.0
  Downloaded comfy-table v6.1.4
  Downloaded async-trait v0.1.62
  Downloaded array-init-cursor v0.2.0
  Downloaded thiserror-impl v1.0.40
  Downloaded streaming-decompression v0.1.2
  Downloaded rustc_version v0.4.0
  Downloaded rawpointer v0.2.1
  Downloaded hex v0.4.3
  Downloaded futures-task v0.3.25
  Downloaded chrono-tz-build v0.1.0
  Downloaded cfg-if v1.0.0
  Downloaded autocfg v1.1.0
  Downloaded atoi v2.0.0
  Downloaded async-stream v0.3.3
  Downloaded alloc-stdlib v0.2.2
  Downloaded adler32 v1.2.0
  Downloaded jemallocator v0.5.0
  Downloaded num_cpus v1.15.0
  Downloaded num-integer v0.1.45
  Downloaded heck v0.4.0
  Downloaded fs_extra v1.2.0
  Downloaded alloc-no-stdlib v2.0.4
  Downloaded dirs-sys v0.4.0
  Downloaded num-complex v0.4.3
  Downloaded avro-schema v0.3.0
  Downloaded dirs v5.0.0
  Downloaded 188 crates (16.2 MB) in 2.11s (largest was `libz-ng-sys` at 1.8 MB)
   Compiling libc v0.2.139
   Compiling autocfg v1.1.0
   Compiling proc-macro2 v1.0.60
   Compiling unicode-ident v1.0.6
   Compiling quote v1.0.28
   Compiling syn v1.0.107
   Compiling cfg-if v1.0.0
   Compiling serde_derive v1.0.152
   Compiling serde v1.0.152
   Compiling libm v0.2.6
   Compiling version_check v0.9.4
   Compiling scopeguard v1.1.0
   Compiling static_assertions v1.1.0
   Compiling crossbeam-utils v0.8.14
   Compiling futures-core v0.3.25
   Compiling pkg-config v0.3.26
   Compiling memchr v2.5.0
   Compiling siphasher v0.3.10
   Compiling lexical-util v0.8.5
   Compiling futures-channel v0.3.25
   Compiling phf_shared v0.11.1
   Compiling rand_core v0.6.4
   Compiling futures-sink v0.3.25
   Compiling once_cell v1.17.0
   Compiling futures-task v0.3.25
   Compiling num-traits v0.2.15
   Compiling memoffset v0.7.1
   Compiling crossbeam-epoch v0.9.13
   Compiling slab v0.4.7
   Compiling indexmap v1.9.2
   Compiling ahash v0.8.2
   Compiling rand v0.8.5
   Compiling num-integer v0.1.45
   Compiling hashbrown v0.12.3
   Compiling target-features v0.1.3
   Compiling futures-util v0.3.25
   Compiling crc32fast v1.3.2
   Compiling pin-project-lite v0.2.9
   Compiling either v1.8.0
   Compiling regex-syntax v0.6.28
   Compiling rayon-core v1.10.1
   Compiling jobserver v0.1.25
   Compiling crossbeam-deque v0.8.2
   Compiling cc v1.0.78
   Compiling phf_generator v0.11.1
   Compiling crossbeam-channel v0.5.6
   Compiling getrandom v0.2.8
   Compiling num_cpus v1.15.0
   Compiling semver v1.0.16
   Compiling pin-utils v0.1.0
   Compiling itoa v1.0.5
   Compiling log v0.4.17
   Compiling futures-io v0.3.25
   Compiling serde_json v1.0.91
   Compiling ryu v1.0.12
   Compiling phf_codegen v0.11.1
   Compiling cmake v0.1.49
   Compiling lexical-write-integer v0.8.5
   Compiling lexical-parse-integer v0.8.6
   Compiling phf v0.11.1
   Compiling alloc-no-stdlib v2.0.4
   Compiling zstd-safe v5.0.2+zstd.1.5.2
   Compiling async-trait v0.1.62
   Compiling snap v1.1.0
   Compiling lexical-parse-float v0.8.5
   Compiling alloc-stdlib v0.2.2
   Compiling lexical-write-float v0.8.5
   Compiling regex v1.7.1
   Compiling time v0.1.45
   Compiling lock_api v0.4.9
   Compiling adler v1.0.2
   Compiling rayon v1.6.1
   Compiling zstd-safe v6.0.3+zstd.1.5.2
   Compiling rle-decode-fast v1.0.3
   Compiling fallible-streaming-iterator v0.1.9
   Compiling iana-time-zone v0.1.53
   Compiling lexical-core v0.8.5
   Compiling libflate_lz77 v1.1.0
   Compiling rustc_version v0.4.0
   Compiling zstd-sys v2.0.5+zstd.1.5.2
   Compiling libz-ng-sys v1.1.8
   Compiling lz4-sys v1.9.4
   Compiling miniz_oxide v0.6.2
   Compiling parse-zoneinfo v0.3.0
   Compiling brotli-decompressor v2.3.4
   Compiling adler32 v1.2.0
   Compiling crc-catalog v1.1.1
   Compiling array-init-cursor v0.2.0
   Compiling libflate v1.2.0
   Compiling planus v0.3.1
   Compiling crc v2.1.0
   Compiling arrow2 v0.17.0 (https://github.com/jorgecarleitao/arrow2?rev=1491c6e8f4fd100f53c358e4f3ef1536d9e75090#1491c6e8)
   Compiling chrono-tz-build v0.1.0
   Compiling streaming-decompression v0.1.2
   Compiling syn v2.0.18
   Compiling aho-corasick v0.7.20
   Compiling parking_lot_core v0.9.6
   Compiling signal-hook v0.3.14
   Compiling thiserror v1.0.40
   Compiling simdutf8 v0.1.4
   Compiling seq-macro v0.3.2
   Compiling mio v0.8.5
   Compiling signal-hook-registry v1.4.0
   Compiling brotli v3.3.4
   Compiling chrono-tz v0.8.1
   Compiling smartstring v1.0.1
   Compiling ethnum v1.3.2
   Compiling rustversion v1.0.11
   Compiling base64 v0.21.0
   Compiling hash_hasher v2.0.3
   Compiling dyn-clone v1.0.10
   Compiling streaming-iterator v0.1.9
   Compiling smallvec v1.10.0
   Compiling hashbrown v0.13.2
   Compiling strength_reduce v0.2.4
   Compiling foreign_vec v0.1.0
   Compiling ppv-lite86 v0.2.17
   Compiling parking_lot v0.12.1
   Compiling signal-hook-mio v0.2.3
   Compiling sysinfo v0.28.4
   Compiling heck v0.4.0
   Compiling rawpointer v0.2.1
   Compiling rand_chacha v0.3.1
   Compiling bitflags v1.3.2
   Compiling crossterm v0.25.0
   Compiling matrixmultiply v0.3.2
   Compiling num-complex v0.4.3
   Compiling strum v0.24.1
   Compiling unicode-width v0.1.10
   Compiling byteorder v1.4.3
   Compiling itoap v1.0.1
   Compiling xxhash-rust v0.8.6
   Compiling thiserror-impl v1.0.40
   Compiling float-cmp v0.9.0
   Compiling fxhash v0.2.1
   Compiling argminmax v0.6.1
   Compiling tokio v1.28.0
   Compiling hex v0.4.3
   Compiling ndarray v0.15.6
   Compiling rand_distr v0.4.3
   Compiling atoi v2.0.0
   Compiling dirs-sys v0.4.0
   Compiling socket2 v0.4.9
   Compiling dirs v5.0.0
   Compiling lexical v6.1.1
   Compiling memmap2 v0.5.8
   Compiling libR-sys v0.5.0
   Compiling bytes v1.3.0
   Compiling fast-float v0.2.0
   Compiling fs_extra v1.2.0
   Compiling paste v1.0.11
   Compiling glob v0.3.1
   Compiling extendr-engine v0.4.0 (https://github.com/rpolars/extendr?branch=pr473_553_555_566#e8c03bae)
   Compiling futures-macro v0.3.25
   Compiling async-stream-impl v0.3.3
   Compiling bytemuck_derive v1.4.0
   Compiling multiversion-macros v0.7.1
   Compiling async-stream v0.3.3
   Compiling strum_macros v0.24.3
   Compiling enum_dispatch v0.3.11
   Compiling bytemuck v1.13.0
   Compiling jemalloc-sys v0.5.2+5.3.0-patched
   Compiling pin-project-internal v1.0.12
   Compiling sqlparser v0.30.0
   Compiling multiversion v0.7.1
   Compiling polars v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling extendr-api v0.4.0 (https://github.com/rpolars/extendr?branch=pr473_553_555_566#e8c03bae)
   Compiling extendr-macros v0.4.0 (https://github.com/rpolars/extendr?branch=pr473_553_555_566#e8c03bae)
   Compiling spin v0.9.4
   Compiling nanorand v0.7.0
   Compiling lazy_static v1.4.0
   Compiling indenter v0.3.3
   Compiling state v0.5.3
   Compiling comfy-table v6.1.4
   Compiling pin-project v1.0.12
   Compiling flume v0.10.14
   Compiling flate2 v1.0.25
   Compiling futures-executor v0.3.25
   Compiling futures v0.3.25
   Compiling parquet-format-safe v0.2.4
   Compiling lz4 v1.24.0
   Compiling chrono v0.4.23
   Compiling arrow-format v0.8.1
   Compiling json-deserializer v0.4.4
   Compiling polars-utils v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling halfbrown v0.1.18
   Compiling avro-schema v0.3.0
   Compiling value-trait v0.5.1
   Compiling now v0.1.3
   Compiling simd-json v0.7.0
   Compiling jsonpath_lib v0.3.0 (https://github.com/ritchie46/jsonpath?branch=improve_compiled#24eaf0b4)
   Compiling zstd v0.11.2+zstd.1.5.2
   Compiling zstd v0.12.3+zstd.1.5.2
   Compiling parquet2 v0.17.1
   Compiling jemallocator v0.5.0
   Compiling polars-error v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-arrow v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-row v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-core v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-ops v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-time v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-io v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-plan v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-pipe v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-lazy v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling polars-sql v0.28.0 (https://github.com/pola-rs/polars.git?rev=e973f6386a28f16136fb8ba5a737103f95911861#e973f638)
   Compiling r-polars v0.1.0 (/tmp/RtmpN6mFGI/R.INSTALLecf1975c4d/polars/src/rust)
    Finished release [optimized] target(s) in 11m 23s
if [ "" != "true" ]; then \
        rm -Rf /tmp/RtmpN6mFGI/R.INSTALLecf1975c4d/polars/src/.cargo && \
        rm -Rf ./rust/target/release/build; \
fi
if [ -f "./rust/target/release/libr_polars.a" ]; then \
        echo "file is there: "; \
else \
        echo "no, file is NOT there: "; \
        mkdir -p ./rust/target/release ; \
        echo "trying to symlink in "./rust/target/release/libr_polars.a""; \
        ln -s "./rust/target/release/libr_polars.a" ./rust/target/release/libr_polars.a ; \
fi
file is there:
if [ "" == "true" ]; then \
        echo "cleanup!!" ; \
        mv ./rust/target/release/libr_polars.a ./rust/target/release/../temp_binary.a; \
        rm -rf ./rust/target/release; \
        mkdir ./rust/target/release; \
        mv ./rust/target/release/../temp_binary.a ./rust/target/release/libr_polars.a; \
        rm -rf ./src/.cargo; \
else \
        echo "hands off!!" ; \
fi
hands off!!
gcc -shared -L/usr/lib/R/lib -Wl,-z,relro -o polars.so entrypoint.o -L./rust/target/release -lr_polars -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/00LOCK-polars/00new/polars/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (polars)
> packageVersion("polars")
[1] ‘0.6.1.9000’
> q()
Save workspace image? [y/n/c]: n
root@ebc30d914202:/# ls -la ~
total 16
drwx------ 2 root root 4096 Jun 12 00:00 .
drwxr-xr-x 1 root root 4096 Jul  3 12:53 ..
-rw-r--r-- 1 root root  571 Apr 10  2021 .bashrc
-rw-r--r-- 1 root root  161 Jul  9  2019 .profile

eitsupi commented 1 year ago

I am assuming that once #304 is merged we can send the package to CRAN....

It might be better to resolve #230 before releasing, but this seemed like a pretty big change and one that would be difficult for me to fix right away.

sorhawell commented 1 year ago

@eitsupi I was kidnapped into a meeting+follow 6 hours ago. What should I start on now towards cran release?

eitsupi commented 1 year ago

I was kidnapped into a meeting+follow 6 hours ago.

👍

What should I start on now towards cran release?

Perhaps we should create a branch like 0.7.0rc and work there?

My understanding is that the following steps are needed (like eitsupi/prqlr#126):

  1. Increment the version number to 0.7.0 (by usethis::use_version()) and commit.
  2. devtools::check_rhub() (It will take a few hours.)
  3. Add cran-comments.md and commit.
  4. Send the package by devtools::release() and then add CRAN-SUBMISSION and commit.
  5. After releasing, add and push the tag.
  6. Increment the version number to X.Y.Z.9000 and commit.
  7. Merge to main
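Steps 1 and 6 above are pure version arithmetic, which can be sketched as below. `next_dev_version` is a hypothetical helper written for this illustration; it only accepts a plain X.Y.Z release version and appends the `.9000` development suffix.

```shell
#!/bin/sh
# Hypothetical helper: turn a release version X.Y.Z into the
# development version X.Y.Z.9000 used after a release.
next_dev_version() {
  v=$1
  case "$v" in
    ''|*[!0-9.]*|.*|*.|*..*)
      echo "not a release version: $v" >&2
      return 1 ;;
  esac
  # Count the dot-separated fields; a release version has exactly three.
  old_ifs=$IFS; IFS=.
  set -- $v
  IFS=$old_ifs
  if [ "$#" -ne 3 ]; then
    echo "not a release version: $v" >&2
    return 1
  fi
  printf '%s.9000\n' "$v"
}

next_dev_version 0.7.0   # prints 0.7.0.9000
```

A version that already carries a fourth field, such as 0.6.1.9000, is rejected, which guards against bumping twice.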

It is important to note that the final decision on which commit to release is made by CRAN, so strictly speaking, we cannot push tags on the main branch.

Since @etiennebacher and @grantmcdermott obviously know more about submitting packages to CRAN than I do, I would appreciate it if you could tell me about good materials.

sorhawell commented 1 year ago

I will try that now

sorhawell commented 1 year ago

@eitsupi Given that you agree to #307, how would you like this to look?

Authors@R:
  c(person("Ritchie", "Vink", , "ritchie46@gmail.com", role = c("aut")),
    person("Soren", "Welling", , "sorhawell@gmail.com", role = c("aut","cre")))

sorhawell commented 1 year ago

Committed branch 0.7.0rc for check_rhub; will continue tomorrow.

eitsupi commented 1 year ago

Thanks for working on that!

If my name and contact information are to be included in the DESCRIPTION, please list them like this: https://github.com/eitsupi/prqlr/blob/bca43ca2cd17f715f57440da7704d47c4f1ff9db/DESCRIPTION#L9

By the way, I forgot that the README needs to include the installation instructions from CRAN. I will push directly to the 0.7.0rc branch (and open a PR to track the changes).

eitsupi commented 1 year ago

Opened a PR for tracking changes. #308

sorhawell commented 1 year ago

A to-do list before I finalize the submission:

sorhawell commented 1 year ago

rhub_check - error 1

eitsupi commented 1 year ago

Oh, R-hub Windows does not have Rust installed....... (r-hub/rhub#550) It is installed on Linux. (although it timed out the other day when I tried it)

etiennebacher commented 1 year ago

Since @etiennebacher and @grantmcdermott obviously know more about submitting packages to CRAN than I do, I would appreciate it if you could tell me about good materials.

devtools::release() already comes with a nice list of things to do before submitting. There's also the collaborative list I linked in #297.

sorhawell commented 1 year ago

I updated DESCRIPTION to adhere to the list from @etiennebacher.

rhub_check does not lead anywhere.

@eitsupi Should the cran-comments.md just be an empty file in the root?

AFAIK 0.7.0rc is ready to be submitted?

eitsupi commented 1 year ago

The cran-comments.md must mention that this is the first release. https://r-pkgs.org/release.html

For example, I have written the following, but looking at other newly submitted packages (https://nx10.github.io/cransubs/), it seems that something simpler would be fine. https://github.com/eitsupi/prqlr/pull/45/commits/67914d30381f81b92a482afe8f45634a23b16659

AFAIK 0.7.0rc is ready to be submitted?

I am currently working on making sure there are no warnings...... Please wait a moment.

eitsupi commented 1 year ago

As I commented on #308, I believe we are ready to submit.

If the R-hub builder could have done the check, we would not have been surprised by the results when NOT_CRAN="true" is set.......

grantmcdermott commented 1 year ago

Hey folks, sorry I haven't weighed in here yet. Public holiday here yesterday and I'm frantically wrapping up work before boarding a transatlantic flight this afternoon. I can take a proper look on the plane. But will echo Etienne's list and/or the devtools::release checklist if you're new to CRAN submissions. Those are good, easy to follow references... And sounds like everything might already be in order.

eitsupi commented 1 year ago

As noted in https://github.com/pola-rs/r-polars/pull/308#issuecomment-1641130045, CRAN release seems to be difficult at this time. (I don't know of any easy solution to overcome the build time limitation.)


This was seen using 1500% CPU on the Fedora check system during installation. As in

  • checking whether package ‘polars’ can be installed ... [6192s/525s] ERROR

which has also exceeded the allowed CPU time limit on that system. That is a major violation of the CRAN policy, so the package will be removed from CRAN.

Another is misrepresentation of authorship -- you have ignored the policy's (’All components’ includes any downloaded at installation or during use.) and AFAICS not listed the authors and copyright holders of those you download (and certainly not said who is responsible for those).


I finally found this. cargo build help says

Miscellaneous Options
    -j N, --jobs N
        Number of parallel jobs to run. May also be specified with the
        build.jobs config value
        <https://doc.rust-lang.org/cargo/reference/config.html>.

Defaults to the number of logical CPUs.

So another thing for the 'Using Rust' document.

sorhawell commented 1 year ago

6192s seems very slow especially if it was multi-core. My older 2015 computer uses ~1200 secs on 4 physical cores. If 525 secs is the acceptable limit, that could be a bit hard to achieve also. How do we test on their fedora machine?

Maybe then we're back to having to ask for special permission to install from an external binary.

etiennebacher commented 1 year ago

Hi all, sorry I wasn't available in the last few days but I'm not sure I could have helped in any case. If the CRAN release is not gonna happen soon then I agree with @eitsupi about making a github release anyway (cf #308)

How do we test on their fedora machine?

Maybe with rhub::check_on_fedora()?

etiennebacher commented 1 year ago

Also, there's now a 10MB limit for package tarballs: https://github.com/eddelbuettel/crp/commit/6dcb0ec7064e3f73461f5538365f48d061f7fa9c

Is this another concern for us?

sorhawell commented 1 year ago

When submitting to CRAN, I think I saw a message stating the tarball was 500kB or so.

sorhawell commented 1 year ago

Would it be possible to just state that polars is not intended for the "Fedora" distribution?

eitsupi commented 1 year ago

As I understand it, this has nothing to do with Fedora. It simply means that the test was done on Fedora. I would not be surprised if the same problem occurs on Debian, macOS or Windows.

eitsupi commented 1 year ago

Also, there's now a 10MB limit for package tarballs: eddelbuettel/crp@6dcb0ec

Is this another concern for us?

The size of the source tarball would mostly increase with the inclusion of images and PDF files, I think. The arrow package, for example, has given up including vignettes in the package to stay within the size limit and keeps everything on the website only.

If we include all dependent crates in the source in the future, the size limit could become an issue.

david-cortes commented 1 year ago

As noted in #308 (comment), CRAN release seems to be difficult at this time. (I don't know of any easy solution to overcome the build time limitation.)

This was seen using 1500% CPU on the Fedora check system during installation. As in

  • checking whether package ‘polars’ can be installed ... [6192s/525s] ERROR

which has also exceeded the allowed CPU time limit on that system. That is a major violation of the CRAN policy, so the package will be removed from CRAN. Another is misrepresentation of authorship -- you have ignored the policy's (’All components’ includes any downloaded at installation or during use.) and AFAICS not listed the authors and copyright holders of those you download (and certainly not said who is responsible for those).

I finally found this. cargo build help says

Miscellaneous Options
    -j N, --jobs N
        Number of parallel jobs to run. May also be specified with the
        build.jobs config value
        <https://doc.rust-lang.org/cargo/reference/config.html>.

Defaults to the number of logical CPUs. So another thing for the 'Using Rust' document.

Not an expert on the topic, but I think the issue you are seeing is because the compilation process is using more than 1 thread (which it shouldn't), not because of the times - you can see that heavier packages with very long build times have made it to CRAN before, such as MLPack or DuckDB (note that build times for e.g. the ASAN-enabled versions are much, much longer than what's shown on those pages).

Also don't know anything about cargo/rust, but downloading files during package building doesn't sound like something CRAN would like.

sorhawell commented 1 year ago

@david-cortes Thank you for suggesting this interpretation. You could very well be right on that.

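A fast list of what I think we need to address in a new submission:

  • limit n cpus
  • use cargo vendor to bundle all ~150 crates
  • Somehow keep the resulting tarball size less than 10Mb
  • use scripts to auto-generate full authors and licence list of all 150 crates
  • publish all github dependencies as crates on crates.io or bundle that code also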

Currently it seems CRAN is trying out new rules for packages including Rust code. I know maintainers of other, smaller Rust-R packages are scrambling to not get archived. I suspect CRAN has not made the last update of these rules, as the text leaves many open questions to me.

IMO CRAN's focus is all on long-term academic reproducibility and not on supporting new features or having a developer-friendly repository. R-Universe is pretty fine to me. Currently, to also get SIMD optimization, which requires Rust nightly, only the GitHub release binary will do.

I would personally prefer to not be the maintainer of the polars package on CRAN. I support anyone who would take on that role.

Who knows maybe the goal is closer than expected :)

david-cortes commented 1 year ago

@david-cortes Thank you for suggesting this interpretation. You could very well be right on that.

A fast list of what I think we need to address in a new submission:

  • limit n cpus
  • use cargo vendor to bundle all ~150 crates
  • Somehow keep the resulting tarball size less than 10Mb
  • use scripts to auto-generate full authors and licence list of all 150 crates
  • publish all github dependencies as crates on crates.io or bundle that code also

Currently it seems CRAN is trying out new rules for packages including Rust code. I know maintainers of other, smaller Rust-R packages are scrambling to not get archived. I suspect CRAN has not made the last update of these rules, as the text leaves many open questions to me.

IMO CRAN's focus is all on long-term academic reproducibility and not on supporting new features or having a developer-friendly repository. R-Universe is pretty fine to me. Currently, to also get SIMD optimization, which requires Rust nightly, only the GitHub release binary will do.

I would personally prefer to not be the maintainer of the polars package on CRAN. I support anyone who would take on that role.

Who knows maybe the goal is closer than expected :)

Another potential route is to have a system dependency on polars as a shared library, and submit only the R wrapper to CRAN (not sure if that's how this library works though). Software bundles with complicated build dependencies tend to follow that approach instead (e.g. cbc, rgdal), but it's not as easy to install the library as a user then.

sorhawell commented 1 year ago

@david-cortes interesting idea. I see how that could help avoid the CRAN compilation issues. I took a look at rgdal; it appears to be C code wrapping the external requirement gdal via an extern "C" ABI (I presume).

In the case of r-polars and py-polars, it is Rust code wrapping the rust-polars API, compiled into one optimized compilation unit and one binary. I have not yet seen polars shipped as an external binary that could be called via some extern "C" ABI. Anyway, this project would then need to be refactored a whole lot into C, or keep only a minimal bit in Rust, and we would need to publish an external polars-for-R binary via some other channel.

One of the aims of this package is, like data.table, to have no hard dependencies. Therefore installing the package from binary takes ~15 sec and just works.

I personally wanted to learn rust and support R with some better tools for ML preprocessing to keep up with other languages. But ML in production is just not much in the scope of CRAN, I guess. I'm personally not very excited when thinking about CRAN rules. If I still were in academia and wanted to publish some classic statistical work in some R related journal, I would likely use Rcpp and/or plain R, and think CRAN was a perfect match.

mrwunderbar666 commented 1 year ago

Hi,

I have a vested interest in making rpolars available via CRAN. And I know the process is horrible. I could help with preparing the package so the Linux builds will comply with their rules. I've been working on adjusting the build scripts to do most tasks automatically. (See commit in my fork below)

Looking at your todo list:

limit n cpus

I've added checks in Makevars that infer whether it is running on CRAN and then act accordingly
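
To illustrate the kind of check meant here, a minimal shell sketch (not the actual Makevars from the fork). It assumes the build is treated as a CRAN build whenever NOT_CRAN is not "true", and that two jobs is an acceptable cap; the helper name cargo_jobs_flag and the value 2 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the cargo flag that caps parallel jobs.
# NOT_CRAN is the devtools convention discussed earlier in this thread;
# it is not defined by CRAN itself.
cargo_jobs_flag() {
  if [ "${NOT_CRAN:-}" = "true" ]; then
    # Development build: print nothing, so cargo keeps its default
    # (the number of logical CPUs).
    printf '%s' ""
  else
    # Assumed CRAN build: cap the job count (2 is an assumption).
    printf '%s' "--jobs 2"
  fi
}

# Intended use inside a build script (not executed here):
#   cargo build --release $(cargo_jobs_flag)
```

The same cap can also be set through cargo's build.jobs config value or the CARGO_BUILD_JOBS environment variable instead of a command-line flag.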

use cargo vendor to bundle all ~150 crates

There is an example for that here: https://github.com/yutannihilation/string2path I added a package-cran command to the makefile which takes care of the vendoring process and outputs a tar that should be suitable for CRAN submissions.
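
For readers unfamiliar with vendoring, a rough sketch of the steps such a command has to perform (this is not the package-cran target itself). The helper name write_vendor_config and the directory layout are hypothetical; the config written below covers crates.io only, while cargo vendor prints the exact snippet to use, including extra entries for git dependencies.

```shell
#!/bin/sh
# Hypothetical helper: write the source-replacement config that tells
# cargo to resolve crates from ./vendor instead of the network.
# $1: directory that holds Cargo.toml (layout is an assumption).
write_vendor_config() {
  mkdir -p "$1/.cargo" || return 1
  cat > "$1/.cargo/config.toml" <<'EOF'
[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "vendor"
EOF
}

# Typical order of steps (not run here; the first needs network access):
#   cargo vendor                      # copy all dependencies into ./vendor
#   write_vendor_config .             # point cargo at ./vendor
#   tar -cJf vendor.tar.xz vendor     # bundle for the source package
#   cargo build --release --offline   # later builds need no downloads
```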

Somehow keep the resulting tarball size less than 10Mb

This is impossible, and the CRAN maintainers should acknowledge that. Their policies already allow for an exception:

Source package tarballs should if possible not exceed 10MB. It is much preferred that third-party source software should be included within the package (as e.g. a vendor.tar.xz file) than be downloaded at installation: if this requires a larger tarball a modestly increased limit can be requested at submission.

Currently, the resulting tar is about 34 MB.
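
A small check like the following could flag this before submission. It is only a sketch: the function name tarball_within_limit is hypothetical, and it assumes the policy's "10MB" means 10 * 1024 * 1024 bytes, which the policy text does not spell out.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a tarball is within a size limit.
# $1: path to the tarball; $2: limit in MB (default 10, assumed MiB).
# Returns 0 if within the limit, 1 if too large, 2 if the file is missing.
tarball_within_limit() {
  [ -f "$1" ] || return 2
  limit_bytes=$(( ${2:-10} * 1024 * 1024 ))
  # The arithmetic expansion strips the padding some wc versions print.
  size_bytes=$(( $(wc -c < "$1") ))
  [ "$size_bytes" -le "$limit_bytes" ]
}
```

With a tarball of about 34 MB this check fails against the default limit, which is the case where the policy says a modestly increased limit can be requested at submission.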

use scripts to auto-generate full authors and licence list of all 150 crates

I took @yutannihilation's update_authors.R script, which takes care of this process.

publish all github dependencies as crates on crates.io or bundle that code also

I think this won't be necessary, because cargo vendor includes all files for offline compilation