hubverse-org / hubData

Tools for accessing and working with hubverse Hub data
https://hubverse-org.github.io/hubData/

issue installing hubData package #24

Closed M-7th closed 6 months ago

M-7th commented 6 months ago

Hi, last night on the European RespiCast platform we encountered the following issue: the GitHub workflows that calculate the ensemble and scoring, which have always worked correctly, failed with an error while installing the hubData package. Analyzing the logs, we found that the problem relates to the configuration of the arrow package.

ERROR: configuration failed for package ‘arrow’
ERROR: dependency ‘arrow’ is not available for package ‘hubData’
* removing ‘/home/runner/work/_temp/Library/hubData’

As a workaround, we added a step to the workflows that installs the arrow package from the Apache R-universe repository, as suggested by a tip on the hubData installation page, but I'm not sure whether this is the correct solution.

Thanks for your support. Best, Paolo

annakrystalli commented 6 months ago

Could you provide more details about the workflow?

There is a known issue with the arrow binaries on CRAN for macOS, which is why hubData and hubValidations are temporarily using the development version of arrow installed from GitHub until arrow v16 is available on CRAN. This seems to be working fine for our R CMD CHECK workflows on all OSs, as well as for the validation workflows on Ubuntu in other hubs.

Does your workflow print session_info? If so, which version of arrow was being installed, and from where?
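
For workflows that do not already report this, a minimal sketch of a step that prints the installed arrow version and the session details could look like the following. The step name and its position in the workflow are assumptions; it only helps once arrow has installed, so it guards against the package being absent.

```yaml
- name: Report arrow version and session info  # hypothetical step name
  run: |
    # Only query arrow if it is actually installed in the library
    if (requireNamespace("arrow", quietly = TRUE)) {
      print(packageVersion("arrow"))
      # Build details, including which optional features are enabled
      print(arrow::arrow_info())
    } else {
      message("arrow is not installed")
    }
    # R version, platform and attached packages
    print(sessionInfo())
  shell: Rscript {0}
```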

annakrystalli commented 6 months ago

If your workflow uses caching, one thing you could try is invalidating the cache so that all dependencies are reinstalled.

M-7th commented 6 months ago

The workflow runs on ubuntu-latest, and the steps to set up R and install packages are as follows:

- uses: r-lib/actions/setup-r@v2
  with:
    install-r: false
    use-public-rspm: true

- name: Installing dependencies
  run: |
    install.packages("remotes")
    install.packages("arrow", repos = c("https://apache.r-universe.dev", "https://cran.r-project.org"))
    remotes::install_github("Infectious-Disease-Modeling-Hubs/hubData")
    remotes::install_github("Infectious-Disease-Modeling-Hubs/hubEnsembles")
    install.packages("dplyr")
    install.packages("jsonlite")
    install.packages("optparse")
    install.packages("purrr")
  shell: Rscript {0}

The arrow version is 15.0.2; the log shows: Downloading GitHub repo apache/arrow@apache-arrow-15.0.2

M-7th commented 6 months ago

The error trace is

Installing package into ‘/home/runner/work/_temp/Library’
(as ‘lib’ is unspecified)
* installing *source* package ‘arrow’ ...
** using staged installation
*** pkg-config found.
*** libcurl not found
*** Proceeding without libarrow (no local source)
------------------------- NOTE ---------------------------
There was an issue building the Arrow C++ libraries.
See https://arrow.apache.org/docs/r/articles/install.html
---------------------------------------------------------
ERROR: configuration failed for package ‘arrow’
* removing ‘/home/runner/work/_temp/Library/arrow’

annakrystalli commented 6 months ago

It looks like you are missing the system requirement libcurl. On Linux, the required system libraries need to be installed when the package is being installed from source, so you'd need to add a step to install them manually (or choose one of the options in the arrow docs on installing on Linux).
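
A minimal sketch of such a manual step on an Ubuntu runner, placed before the R package installation, might be the following. The step name is hypothetical, and `libcurl4-openssl-dev` is the usual Ubuntu package providing the libcurl development files; it is an assumption here, since the thread does not name the package, and a source build of arrow may need further libraries listed in the arrow installation docs.

```yaml
- name: Install system libraries for arrow  # hypothetical step name
  run: |
    # Refresh the package index, then install the libcurl development files
    # (package name assumed; check the arrow docs for the full list)
    sudo apt-get update
    sudo apt-get install -y libcurl4-openssl-dev
```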

The installation of these system requirements is handled automatically by the setup-r-dependencies step in R CMD CHECK workflows, which is why our workflows aren't breaking.
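
As a rough, untested sketch of that approach outside an R CMD CHECK workflow, the manual install step could be replaced with `r-lib/actions/setup-r-dependencies`. The package list mirrors the one in the workflow above; the `cache-version` value is arbitrary, and changing it is one way to invalidate a previously cached library. Whether this resolves arrow cleanly for this hub is an assumption, not something confirmed in this thread.

```yaml
- uses: r-lib/actions/setup-r-dependencies@v2
  with:
    # Changing this value discards the previously cached package library
    cache-version: 2
    # Packages to install; system requirements are installed alongside them on Linux
    packages: |
      github::Infectious-Disease-Modeling-Hubs/hubData
      github::Infectious-Disease-Modeling-Hubs/hubEnsembles
      any::dplyr
      any::jsonlite
      any::optparse
      any::purrr
```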

Given that we will revert to using the arrow binaries once v16 is released, I suggest you use the binaries from R-universe for now; you can then revert to your original workflow when v16 is released.

M-7th commented 6 months ago

Okay, we will proceed as you suggest. Thanks for your kind help.