apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[R]: Build system sneaks in rpath which breaks loading: arrow.so: Library not loaded: @rpath/libarrow.1100.dylib #35045

Open barracuda156 opened 1 year ago

barracuda156 commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

I am fixing R packages for MacPorts. I have apparently fixed arrow itself across macOS versions, but I have hit a problem with R-arrow. Does anyone know where this rpath sneaks in from?

/opt/local/bin/g++-mp-12 -std=gnu++17 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/opt/local/Library/Frameworks/R.framework/Resources/lib -Wl,-headerpad_max_install_names -Wl,-rpath,/opt/local/lib/libgcc -L/opt/local/lib -lMacportsLegacySupport -arch ppc -o arrow.so RTasks.o altrep.o array.o array_to_vector.o arraydata.o arrowExports.o bridge.o buffer.o chunkedarray.o compression.o compute-exec.o compute.o config.o csv.o dataset.o datatype.o expression.o extension-impl.o feather.o field.o filesystem.o io.o json.o memorypool.o message.o parquet.o r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o safe-call-into-r-impl.o scalar.o schema.o symbols.o table.o threadpool.o type_infer.o -L/opt/local/lib -larrow -F/opt/local/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/arrow/arrow.Rcheck/00LOCK-arrow/00new/arrow/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/arrow/arrow.Rcheck/00LOCK-arrow/00new/arrow/libs/arrow.so':
  dlopen(/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/arrow/arrow.Rcheck/00LOCK-arrow/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.1100.dylib
  Referenced from: /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/arrow/arrow.Rcheck/00LOCK-arrow/00new/arrow/libs/arrow.so
  Reason: image not found
Error: loading failed

This is not hard to fix for the install step (using install_name_tool to fix the paths manually). However, I do not immediately see how to fix it for the test run, and it is desirable to have the tests working.

The lib indeed gets this:

svacchanda$ otool -L /Users/svacchanda/Desktop/arrow/arrow.Rcheck/00_pkg_src/arrow/src/arrow.so
/Users/svacchanda/Desktop/arrow/arrow.Rcheck/00_pkg_src/arrow/src/arrow.so:
        arrow.so (compatibility version 0.0.0, current version 0.0.0)
        /opt/local/lib/libMacportsLegacySupport.dylib (compatibility version 1.0.0, current version 1.0.99)
        @rpath/libarrow.1100.dylib (compatibility version 1100.0.0, current version 1100.0.0)
        /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 511.1.0)
        /opt/local/lib/libgcc/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.30.0)
        /opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libR.dylib (compatibility version 4.2.0, current version 4.2.3)
        /opt/local/lib/libgcc/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
        /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 117.0.0)

What I need is to change this to an absolute path, as for the other dylibs, and to do so during the build: since a single command builds the package and runs its tests, I cannot squeeze in patching via install_name_tool.
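The manual fix mentioned above can be sketched as a small shell helper. It reads otool -L output and prints one install_name_tool -change command per @rpath/ entry; it only prints the commands and modifies nothing. The function name rpath_fix_commands is hypothetical, and the sketch assumes that every such library sits directly under the given prefix (here /opt/local/lib). Neither comes from the arrow or MacPorts build systems.

```shell
#!/bin/sh
# Sketch: turn `otool -L` output into install_name_tool commands that
# replace @rpath/ references with absolute paths under a given prefix.
# Assumption: the referenced libraries live directly in PREFIX.

# rpath_fix_commands PREFIX TARGET
# Reads `otool -L` output on stdin; prints one command per @rpath entry.
rpath_fix_commands() {
    prefix=$1
    target=$2
    awk -v prefix="$prefix" -v target="$target" '
        $1 ~ /^@rpath\// {
            name = $1
            sub(/^@rpath\//, "", name)   # keep only the library file name
            printf "install_name_tool -change %s %s/%s %s\n", $1, prefix, name, target
        }'
}

# Usage on macOS (prints the commands; pipe to sh to apply them):
#   otool -L arrow.so | rpath_fix_commands /opt/local/lib arrow.so
```

Printing the commands instead of running them keeps the helper safe to try on any shared object; it does not solve the problem of hooking the fix in between the build and the load test.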

Component(s)

R

nealrichardson commented 1 year ago

How are you building the Arrow C++ library? Can you share your full R CMD INSTALL output?

There is a CMake option to use (or not use) rpath, which defaults to ON: https://github.com/apache/arrow/blob/main/cpp/cmake_modules/DefineOptions.cmake#L206-L207

Sounds like you may want that off? If you're not building Arrow C++ separately from the R package build, you can pass this in by setting the env var EXTRA_CMAKE_FLAGS="-DARROW_INSTALL_NAME_RPATH=OFF"

https://github.com/apache/arrow/blob/main/r/inst/build_arrow_static.sh#L89

barracuda156 commented 1 year ago

@nealrichardson Thank you for responding. We use an external apache-arrow (there is a port for it); it is desirable not to rebuild the same thing over again, especially something this heavy.

AFAIU, the CMake setting would define library paths for arrow itself, but not for its R package. Forcing CMake not to use rpaths for arrow (outside R) may have undesirable effects, which I won’t be able to test.

With that in mind, it appears that the best solution would be to deal with the rpath at the level of the R package. Something has to pull it in, but a quick search through the source files did not turn it up. (The standard MacPorts flags borrowed from R are not to blame: they are always identical, yet not every linked dylib fails to load.)
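One way to narrow down where the @rpath reference comes from: on macOS the static linker copies each dylib's install name into whatever links against it, so a plausible source is libarrow's own install name rather than the R package's flags. The helper below is a minimal sketch for checking that. It reads otool -D output on stdin and succeeds if the install name starts with @rpath/. The function name is hypothetical, and it assumes the single-architecture output layout (file name on the first line, install name on the second).

```shell
#!/bin/sh
# Sketch: check whether a dylib advertises an @rpath-based install name.
# Assumption: stdin is `otool -D <dylib>` output for a single architecture,
# i.e. "<file>:" on line 1 and the install name on line 2.

uses_rpath_install_name() {
    sed -n '2p' | grep -q '^[[:space:]]*@rpath/'
}

# Usage on macOS:
#   otool -D /opt/local/lib/libarrow.1100.dylib | uses_rpath_install_name \
#       && echo "libarrow's install name is @rpath-based"
```

If the check succeeds, anything linked with -larrow inherits the @rpath/ reference. Note that the link line in the original report passes only -Wl,-rpath,/opt/local/lib/libgcc as a search path.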

nealrichardson commented 1 year ago

Ok. https://arrow.apache.org/docs/r/articles/developers/setup.html#rpath-issues may also be relevant for you. If not, there are several other developer guides at https://arrow.apache.org/docs/r/articles/ that could help, and you can also wade into the configure script if you need more.

barracuda156 commented 1 year ago

@nealrichardson I have rebuilt arrow with the advised settings (rpath etc.); now I get a different load error with R-arrow:

** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/destroot/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/00LOCK-arrow/00new/arrow/libs/arrow.so':
  dlopen(/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/destroot/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/00LOCK-arrow/00new/arrow/libs/arrow.so, 6): Symbol not found: __ZNK5arrow8DataType18ComputeFingerprintB5cxx11Ev
  Referenced from: /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_R_R-arrow/R-arrow/work/destroot/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/00LOCK-arrow/00new/arrow/libs/arrow.so
  Expected in: dynamic lookup

Error: loading failed
Execution halted

barracuda156 commented 1 year ago

Perhaps -D_GLIBCXX_USE_CXX11_ABI=0 may be needed.

UPD. Bingo. Running tests now.
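For context, the missing symbol ends in B5cxx11, which is how libstdc++'s [abi:cxx11] tag appears in a mangled name. That suggests arrow.so was compiled against the new libstdc++ string ABI while the libarrow it loaded was not. A minimal sketch for spotting the mismatch follows. The helper name is hypothetical and it only looks for the tag in a list of symbol names read from stdin.

```shell
#!/bin/sh
# Sketch: detect the libstdc++ [abi:cxx11] tag in a list of mangled symbols.
# Reads symbol names on stdin; succeeds if any of them contains "B5cxx11".

has_cxx11_abi_tag() {
    grep -q 'B5cxx11'
}

# Usage on macOS (paths are the ones from this report):
#   nm -u arrow.so | has_cxx11_abi_tag \
#       && echo "arrow.so expects cxx11-ABI symbols"
#   nm -g /opt/local/lib/libarrow.1100.dylib | has_cxx11_abi_tag \
#       || echo "libarrow exports no cxx11-ABI symbols"
```

If the two sides disagree, building both with the same _GLIBCXX_USE_CXX11_ABI setting (here 0) is the kind of fix that worked in this thread.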

barracuda156 commented 1 year ago

@nealrichardson Well, things aren’t too bad, given that this is 10.6 and PPC (https://github.com/apache/arrow/issues/35083): 11 tests fail, mostly due to a single silly locale error. Hopefully that can be fixed.

barracuda156 commented 1 year ago

@nealrichardson Oddly, on macOS 11/x86_64 another loading error occurs:

** testing if installed package can be loaded from temporary location
  sh: line 1: 53015 Segmentation fault: 11  R_TESTS= '/opt/local/Library/Frameworks/R.framework/Resources/bin/R' --no-save --no-restore --no-echo 2>&1 < '/opt/local/var/macports/build/_Users_runner_work_macports-ports_macports-ports_ports_R_R-arrow/R-arrow/work/.tmp/RtmpJr7boE/fileb7e56efbf75f'

   *** caught segfault ***
  address 0x1425d8, cause 'memory not mapped'

  Traceback:
   1: dyn.load(file, DLLpath = DLLpath, ...)
   2: library.dynam(lib, package, package.lib)
   3: loadNamespace(package, lib.loc)
   4: doTryCatch(return(expr), name, parentenv, handler)
   5: tryCatchOne(expr, names, parentenv, handlers[[1L]])
   6: tryCatchList(expr, classes, parentenv, handlers)
   7: tryCatch({    attr(package, "LibPath") <- which.lib.loc    ns <- loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, deps, exclude, include.only)}, error = function(e) {    P <- if (!is.null(cc <- conditionCall(e)))         paste(" in", deparse(cc)[1L])    else ""    msg <- gettextf("package or namespace load failed for %s%s:\n %s",         sQuote(package), P, conditionMessage(e))    if (logical.return && !quietly)         message(paste("Error:", msg), domain = NA)    else stop(msg, call. = FALSE, domain = NA)})
   8: library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = TRUE)
   9: withCallingHandlers(expr, packageStartupMessage = function(c) tryInvokeRestart("muffleMessage"))
  10: suppressPackageStartupMessages(library(pkg_name, lib.loc = lib,     character.only = TRUE, logical.return = TRUE))
  11: doTryCatch(return(expr), name, parentenv, handler)
  12: tryCatchOne(expr, names, parentenv, handlers[[1L]])
  13: tryCatchList(expr, classes, parentenv, handlers)
  14: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call, nlines = 1L)        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        sm <- strsplit(conditionMessage(e), "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && isTRUE(getOption("show.error.messages"))) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
  15: try(suppressPackageStartupMessages(library(pkg_name, lib.loc = lib,     character.only = TRUE, logical.return = TRUE)))
  16: tools:::.test_load_package("arrow", "/opt/local/var/macports/build/_Users_runner_work_macports-ports_macports-ports_ports_R_R-arrow/R-arrow/work/destroot/opt/local/Library/Frameworks/R.framework/Versions/4.2/Resources/library/00LOCK-arrow/00new")
  An irrecoverable exception occurred. R is aborting now ...
  ERROR: loading failed

Complete log from buildbot: https://github.com/macports/macports-ports/actions/runs/4682210687/jobs/8295764619?pr=18244