nflverse / nflverse-data

Automated nflverse data repository
https://www.nflverse.com
Creative Commons Attribution 4.0 International

Preparation for possible {arrow} removal on CRAN #40

Closed: mrcaseb closed this issue 9 months ago

mrcaseb commented 9 months ago

It is highly likely that the R package {arrow} will be (temporarily) archived on CRAN.

This affects all nflverse releases that include .parquet files, because the corresponding workflows install arrow from CRAN.

The least invasive solution is to add the arrow r-universe repo to the list of repos when setting up R on the runner, so that a binary can be installed from there. We can find the suggested repo here. The YAML should look like this:

- uses: r-lib/actions/setup-r@v2
  with:
    extra-repositories: 'https://apache.r-universe.dev/'
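For context, here is a rough sketch of where that step could sit in a release workflow. The job name and the package list are hypothetical and not taken from the actual nflverse workflows; the sketch only assumes that dependencies are installed with `r-lib/actions/setup-r-dependencies`, which may not match every affected workflow.

```yaml
# Sketch only: the job name and package list are illustrative assumptions.
jobs:
  release-parquet:            # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: r-lib/actions/setup-r@v2
        with:
          # Makes the Apache r-universe repo available in addition to CRAN,
          # so arrow can still be installed if it is archived on CRAN.
          extra-repositories: 'https://apache.r-universe.dev/'

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          # Assumed package list; each workflow would keep its own.
          packages: |
            any::arrow
            any::piggyback
```

Because the extra repository is only added to the list of repos, the change should be harmless while arrow is still available on CRAN.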

Affected tags in nflverse-data

I am not 100% sure whether there are releases in other repos that require arrow to save parquet files. In nflverse-data, we can list all affected tags using this code:

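# List every release asset in the nflverse-data repo
all_assets <- piggyback::pb_list("nflverse/nflverse-data")

# Keep the parquet assets and return the distinct release tags
all_assets |>
  dplyr::filter(
    # fixed() matches the literal ".parquet"; a bare "." is a regex wildcard
    stringr::str_detect(file_name, stringr::fixed(".parquet"))
  ) |>
  dplyr::distinct(tag)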

which outputs the following list

mrcaseb commented 9 months ago

For future reference: I checked the docs and can confirm that the r-universe repo hosts Linux binaries for r-release on ubuntu:latest, which is perfectly fine for our GitHub workflows.

https://github.com/r-universe-org/help?tab=readme-ov-file#does-r-universe-have-linux-binaries

mrcaseb commented 9 months ago

Arrow has been updated on CRAN, so we are closing this (at least for now).