[R] Installation from source without system `libarrow` fails due to linkage error #43545

Open david-cortes opened 1 month ago

david-cortes commented 1 month ago

Describe the bug, including details regarding any error messages, version, and platform.

Installing the R package arrow from source without a core libarrow in the system fails on Debian 12, due to a linkage error.

The issue was reported previously: https://github.com/apache/arrow/issues/43337

The comments in that thread attributed it to a bug in a specific version of pkgconf, but the issue still occurs on a system that doesn't use the pkgconf version where that bug was reported.

The problem happens regardless of whether:

Compiler info:

OS info:

R info:

To reproduce:



assignUser commented 1 month ago

-- Found thrift: /home/david/anaconda3
-- Providing CMake module for FindThriftAlt as part of Parquet CMake package
-- pkg-config package for thrift that is used by parquet for static link isn't found

CMake found conda's thrift, which likely provides only dynamic libraries and does not provide a pkg-config file. This means it will not be linked correctly in the R package. AFAIK there is nothing we can really do to support this scenario (cc @kou) other than preventing the libarrow build from picking up any system dependency that only comes as a shared library, which is also not trivial (see #43353).

You can work around this by:

david-cortes commented 1 month ago

Just to clarify: this is a system R installation (through the Debian package manager) outside of conda, in a setup where the user has a conda installation (which doesn't manage R) with its paths added to ${PATH}.

I would venture to guess that it is a very typical setup for desktop users to have a system R installation alongside a conda installation managing Python kernels, with the conda paths always in ${PATH} (e.g. in order to use reticulate in R).

And also to clarify: there aren't arrow binaries for many linux platforms, so e.g. setting LIBARROW_BINARY=true would still make install.packages compile libarrow from source on Debian and some RH-derived distributions. Getting it to install correctly requires quite a bit of tinkering with environment variables (e.g. removing conda paths, or creating a libarrow install elsewhere and adding it to the config).
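As a rough illustration of the "removing conda paths" tinkering mentioned above, here is a minimal shell sketch. The helper name `strip_path_entries` is hypothetical, and matching on the substring `anaconda3` is an assumption based on the `/home/david/anaconda3` prefix in the CMake log; adjust it to wherever conda actually lives.

```shell
# Hypothetical helper: print a PATH-like string with every entry that
# contains the given substring removed. The body runs in a subshell, so
# the IFS and globbing changes do not leak into the calling shell.
strip_path_entries() (
  needle="$1"
  set -f          # disable globbing while splitting on ':'
  IFS=":"
  out=""
  for entry in $2; do
    case "$entry" in
      *"$needle"*) ;;                        # drop matching entries
      *) out="${out:+$out:}$entry" ;;        # keep everything else
    esac
  done
  printf '%s\n' "$out"
)

# Example (not run here): install with conda hidden for this one command.
# PATH="$(strip_path_entries anaconda3 "$PATH")" \
#   Rscript -e 'install.packages("arrow", repos = "https://cloud.r-project.org")'
```

Scoping the modified `PATH` to a single command leaves the interactive shell, and tools such as reticulate that rely on conda, untouched.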

assignUser commented 1 month ago

By the way, thanks for the detailed report, it makes it much easier to investigate.

> a very typical setup

I'd bet against that, as conda is not intended to be a system package manager and the docs specifically recommend not adding it to the path: 'Anaconda recommends against adding Anaconda to the PATH manually.' But again, even if you do this, you can also just install libarrow-all via conda and the package should pick it up (if your pkg-config path is set correctly).

> compile libarrow from source on Debian and some RH-derived distributions

This is not true. As long as you are on x86 with libstdc++ and have the openssl and libcurl dev libraries installed (e.g. libssl-dev and libcurl4-openssl-dev for Debian), our binaries will work, see docs. To avoid issues with CRAN you have to opt in to this by setting LIBARROW_BINARY=true or NOT_CRAN=true; the latter is set by a number of packages like {pak} and {remotes}, so you don't even have to do it manually in that case. As you can see below, I installed with the precompiled binary on Debian bookworm, Debian testing and AlmaLinux with no problem.
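For readers following along, the opt-in described in this comment could be sketched as a small shell wrapper. This is only an illustration, not the commenter's own output: the function name `with_libarrow_binary` is hypothetical, and only the `LIBARROW_BINARY=true` variable comes from the comment itself.

```shell
# Hypothetical wrapper: run any command with LIBARROW_BINARY=true set
# for that command only, without exporting it into the current shell.
with_libarrow_binary() {
  env LIBARROW_BINARY=true "$@"
}

# Example (not run here): opt in to the precompiled libarrow binary.
# with_libarrow_binary Rscript -e \
#   'install.packages("arrow", repos = "https://cloud.r-project.org")'
```

Using `env` keeps the variable scoped to the one install command, so later source builds in the same session are not affected.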

> Getting it to install correctly requires quite a bit of tinkering with environment variables

If you have a non-standard environment this can be the case, yes. As mentioned in another issue, we spend a lot of time and effort on making it as easy as possible to install the package without forcing a system dependency on arrow, but we are also not :mage:. In addition, there are conda, PPP and r-universe, where you can get compiled binaries for Linux too.

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"

assignUser commented 1 month ago

> *** libcurl not found

As mentioned before, you need libcurl and openssl installed; see my previous post for the exact package names.
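One way to check for these prerequisites before installing is to ask pkg-config whether it can see them. The sketch below assumes the dev packages ship pkg-config files named `libcurl` and `openssl`; it is not necessarily how the arrow install script performs its own check. The helper name `missing_pc_modules` is hypothetical, and the `PKG_CONFIG` override follows the common build-tool convention.

```shell
# Hypothetical helper: print each requested pkg-config module that is
# not visible. PKG_CONFIG may point at an alternative executable and
# defaults to the pkg-config found on PATH. If that executable is not
# installed at all, every module is reported as missing.
missing_pc_modules() {
  probe="${PKG_CONFIG:-pkg-config}"
  for mod in "$@"; do
    "$probe" --exists "$mod" 2>/dev/null || printf '%s\n' "$mod"
  done
}

# Example (not run here): empty output means both modules were found.
# missing_pc_modules libcurl openssl
```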