Nicola Crane / @thisisnic: Hi [~max_koe], thanks for reporting this in such excellent detail.
Looking at the environment variables you have set, I think you are following the path mentioned in the installation guide which results in using devtoolset and building Arrow from source.
It's the internet connection piece that I think is causing complications here. When installing the arrow package from source, pkgconfig is used to find the relevant dependencies on the system, but the instructions for compiling with devtoolset disable this. I think this is so that we can ensure that everything has been compiled using devtoolset and so that there aren't any mismatches between compilers.
When pkgconfig is not used to find dependencies, a number of the Arrow C++ dependencies have to be downloaded from the internet. The installation process here has therefore only installed the components that can be installed without an internet connection, which is why the parquet writing fails.
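One way to see which features a build left out is to scan a saved copy of the install output for settings reported as 'OFF'. The following is a minimal sketch, assuming the output was saved to a file and that settings appear in the NAME='OFF' form printed on the "**** arrow with SOURCE_DIR=..." line; the function name `disabled_arrow_features` is hypothetical and not part of arrow.

```shell
# Hypothetical helper: list the Arrow features switched off in a saved install log.
# Assumes settings are printed as NAME='OFF', as in the R package's build line.
disabled_arrow_features() {
  # $1 is the path to the saved log file
  grep -o "ARROW_[A-Z0-9_]*='OFF'" "$1" | sed "s/='OFF'//" | sort -u
}
```

Calling it as `disabled_arrow_features install.log` (where `install.log` is an assumed file name) should list ARROW_PARQUET among the disabled features for a build like the one in this report.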
There are potentially a few possible solutions; I think they are:
Set CC=/usr/bin/gcc and CXX=/usr/bin/g++ to use the system compilers, and then set LIBARROW_BINARY and ARROW_USE_PKG_CONFIG both to true to download a binary package, or I'll check with some others to see if there's anything else to suggest.
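As a rough sketch of the first suggestion, the variables could be exported in the shell that later launches R. The values are the ones named above; the install command in the closing comment is an assumption about how the package would then be installed, and downloading a binary still requires that R can reach the download location.

```shell
# Point the build at the system compilers (paths as suggested above).
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++
# Ask the arrow R package to look for a prebuilt C++ binary and to use pkg-config.
export LIBARROW_BINARY=true
export ARROW_USE_PKG_CONFIG=true

# Then install from the same shell so that R inherits the variables, e.g.:
#   Rscript -e 'install.packages("arrow")'
```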
Maximilian König: Hello and thanks for checking back in on my Issue.
I have questions regarding your points:
RE 1.) Should I not use the devtoolset when doing this? If I don't, my install fails because the native compiler seems not to have multilib support. As long as the devtoolset is activated, /usr/bin/{gcc,g++} link to the activated compilers anyway, as far as I understand.
RE 2.) I do not have a comparable RHEL-Machine with internet access. I could create an AWS-RHEL instance but it would differ in several aspects from my target system. Would this be good enough to try this or does this seem pointless to you?
Further Question: Is there maybe a way to supply the third party dependencies manually? Looking back at the output, it looks a bit as if parquet support is part of these dependencies (... ARROW_PARQUET='OFF' ...)
EDIT: When I do not use the devtoolset environment, I get the following error output. As mentioned before, my google-fu indicated that it is caused by not having proper g++-multilib support installed:
* installing *source* package ‘arrow’ ...
** package ‘arrow’ successfully unpacked and MD5 sums checked
** using staged installation
trying URL 'https://github.com'
Error in download.file(from_url, to_file, quiet = quietly) :
cannot open URL 'https://github.com'
*** Found local C++ source: 'tools/cpp'
*** Building C++ libraries
*** Building with MAKEFLAGS= -j2
*** Building C++ library from source, but downloading thirdparty dependencies
is not possible, so this build will turn off all thirdparty features.
See install vignette for details:
https://cran.r-project.org/web/packages/arrow/vignettes/install.html
**** arrow with SOURCE_DIR='tools/cpp' BUILD_DIR='/var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31' DEST_DIR='libarrow/arrow-6.0.1' CMAKE='/opt/shared/operations/tmp/a_504k5/cmake3/cmake-3.22.0-linux-x86_64/bin/cmake' EXTRA_CMAKE_FLAGS=' -DARROW_SIMD_LEVEL=NONE -DARROW_RUNTIME_SIMD_LEVEL=NONE' CC='gcc -m64 -std=gnu99' CXX='g++ -m64 -std=gnu++11' LDFLAGS='-Wl,-z,relro' ARROW_S3='OFF' ARROW_MIMALLOC='OFF' ARROW_JEMALLOC='OFF' ARROW_JSON='OFF' ARROW_PARQUET='OFF' ARROW_DATASET='OFF' ARROW_WITH_BROTLI='OFF' ARROW_WITH_BZ2='OFF' ARROW_WITH_LZ4='OFF' ARROW_WITH_SNAPPY='OFF' ARROW_WITH_ZLIB='OFF' ARROW_WITH_ZSTD='OFF' ARROW_WITH_RE2='OFF' ARROW_WITH_UTF8PROC='OFF'
++ pwd
+ : /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow
+ : tools/cpp
+ : /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31
+ : libarrow/arrow-6.0.1
+ : /opt/shared/operations/tmp/a_504k5/cmake3/cmake-3.22.0-linux-x86_64/bin/cmake
++ cd tools/cpp
++ pwd
+ SOURCE_DIR=/var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp
++ mkdir -p libarrow/arrow-6.0.1
++ cd libarrow/arrow-6.0.1
++ pwd
+ DEST_DIR=/var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/libarrow/arrow-6.0.1
+ '[' '' '!=' '' ']'
+ '[' '' = false ']'
+ ARROW_DEFAULT_PARAM=OFF
+ mkdir -p /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31
+ pushd /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31
/var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31 /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow
+ /opt/shared/operations/tmp/a_504k5/cmake3/cmake-3.22.0-linux-x86_64/bin/cmake -DARROW_BOOST_USE_SHARED=OFF -DARROW_BUILD_TESTS=OFF -DARROW_BUILD_SHARED=OFF -DARROW_BUILD_STATIC=ON -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=OFF -DARROW_DEPENDENCY_SOURCE=BUNDLED -DAWSSDK_SOURCE= -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=OFF -DARROW_MIMALLOC=OFF -DARROW_JSON=OFF -DARROW_PARQUET=OFF -DARROW_S3=OFF -DARROW_WITH_BROTLI=OFF -DARROW_WITH_BZ2=OFF -DARROW_WITH_LZ4=OFF -DARROW_WITH_RE2=OFF -DARROW_WITH_SNAPPY=OFF -DARROW_WITH_UTF8PROC=OFF -DARROW_WITH_ZLIB=OFF -DARROW_WITH_ZSTD=OFF -DARROW_VERBOSE_THIRDPARTY_BUILD=OFF -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_LIBDIR=lib -DCMAKE_INSTALL_PREFIX=/var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/libarrow/arrow-6.0.1 -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON -DCMAKE_UNITY_BUILD=ON -DARROW_SIMD_LEVEL=NONE -DARROW_RUNTIME_SIMD_LEVEL=NONE -G 'Unix Makefiles' /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp
-- Building using CMake version: 3.22.0
-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Arrow version: 6.0.1 (full: '6.0.1')
-- Arrow SO version: 600 (full: 600.1.0)
-- clang-tidy not found
-- clang-format not found
-- Could NOT find ClangTools (missing: CLANG_FORMAT_BIN CLANG_TIDY_BIN)
-- infer not found
fatal: not a git repository (or any parent up to mount point /var/opt/sas)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
-- Found Python3: /usr/bin/python3.6 (found version "3.6.8") found components: Interpreter
-- Found cpplint executable at /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp/build-support/cpplint.py
-- System processor: x86_64
-- Performing Test CXX_SUPPORTS_SSE4_2
-- Performing Test CXX_SUPPORTS_SSE4_2 - Success
-- Performing Test CXX_SUPPORTS_AVX2
-- Performing Test CXX_SUPPORTS_AVX2 - Failed
-- Performing Test CXX_SUPPORTS_AVX512
-- Performing Test CXX_SUPPORTS_AVX512 - Failed
-- Arrow build warning level: PRODUCTION
Using ld linker
Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
-- Build Type: RELEASE
-- Performing Test CXX_LINKER_SUPPORTS_VERSION_SCRIPT
-- Performing Test CXX_LINKER_SUPPORTS_VERSION_SCRIPT - Success
-- Using BUNDLED approach to find dependencies
-- ARROW_ABSL_BUILD_VERSION: 20210324.2
-- ARROW_ABSL_BUILD_SHA256_CHECKSUM: 59b862f50e710277f8ede96f083a5bb8d7c9595376146838b9580be90374ee1f
-- ARROW_AWSSDK_BUILD_VERSION: 1.8.133
-- ARROW_AWSSDK_BUILD_SHA256_CHECKSUM: d6c495bc06be5e21dac716571305d77437e7cfd62a2226b8fe48d9ab5785a8d6
-- ARROW_AWS_CHECKSUMS_BUILD_VERSION: v0.1.10
-- ARROW_AWS_CHECKSUMS_BUILD_SHA256_CHECKSUM: c9d0100a5743765fc8034e34e2310f77f59b1adab6f2e2f2d4d2a3bd81b2a36d
-- ARROW_AWS_C_COMMON_BUILD_VERSION: v0.6.9
-- ARROW_AWS_C_COMMON_BUILD_SHA256_CHECKSUM: 928a3e36f24d1ee46f9eec360ec5cebfe8b9b8994fe39d4fa74ff51aebb12717
-- ARROW_AWS_C_EVENT_STREAM_BUILD_VERSION: v0.1.5
-- ARROW_AWS_C_EVENT_STREAM_BUILD_SHA256_CHECKSUM: f1b423a487b5d6dca118bfc0d0c6cc596dc476b282258a3228e73a8f730422d4
-- ARROW_BOOST_BUILD_VERSION: 1.75.0
-- ARROW_BOOST_BUILD_SHA256_CHECKSUM: cb97b36e2295a321c34851e0455bc2630ad6c691d4f9f589170066cd11c835b4
-- ARROW_BROTLI_BUILD_VERSION: v1.0.9
-- ARROW_BROTLI_BUILD_SHA256_CHECKSUM: f9e8d81d0405ba66d181529af42a3354f838c939095ff99930da6aa9cdf6fe46
-- ARROW_BZIP2_BUILD_VERSION: 1.0.8
-- ARROW_BZIP2_BUILD_SHA256_CHECKSUM: ab5a03176ee106d3f0fa90e381da478ddae405918153cca248e682cd0c4a2269
-- ARROW_CARES_BUILD_VERSION: 1.17.1
-- ARROW_CARES_BUILD_SHA256_CHECKSUM: d73dd0f6de824afd407ce10750ea081af47eba52b8a6cb307d220131ad93fc40
-- ARROW_CRC32C_BUILD_VERSION: 1.1.2
-- ARROW_CRC32C_BUILD_SHA256_CHECKSUM: ac07840513072b7fcebda6e821068aa04889018f24e10e46181068fb214d7e56
-- ARROW_GBENCHMARK_BUILD_VERSION: v1.5.2
-- ARROW_GBENCHMARK_BUILD_SHA256_CHECKSUM: dccbdab796baa1043f04982147e67bb6e118fe610da2c65f88912d73987e700c
-- ARROW_GFLAGS_BUILD_VERSION: v2.2.2
-- ARROW_GFLAGS_BUILD_SHA256_CHECKSUM: 34af2f15cf7367513b352bdcd2493ab14ce43692d2dcd9dfc499492966c64dcf
-- ARROW_GLOG_BUILD_VERSION: v0.4.0
-- ARROW_GLOG_BUILD_SHA256_CHECKSUM: f28359aeba12f30d73d9e4711ef356dc842886968112162bc73002645139c39c
-- ARROW_GOOGLE_CLOUD_CPP_BUILD_VERSION: v1.31.1
-- ARROW_GOOGLE_CLOUD_CPP_BUILD_SHA256_CHECKSUM: dc7cbf95b506a84b48cf71e0462985d262183edeaabdacaaee2109852394a609
-- ARROW_GRPC_BUILD_VERSION: v1.35.0
-- ARROW_GRPC_BUILD_SHA256_CHECKSUM: 27dd2fc5c9809ddcde8eb6fa1fa278a3486566dfc28335fca13eb8df8bd3b958
-- ARROW_GTEST_BUILD_VERSION: 1.10.0
-- ARROW_GTEST_BUILD_SHA256_CHECKSUM: 9dc9157a9a1551ec7a7e43daea9a694a0bb5fb8bec81235d8a1e6ef64c716dcb
-- ARROW_JEMALLOC_BUILD_VERSION: 5.2.1
-- ARROW_JEMALLOC_BUILD_SHA256_CHECKSUM: 34330e5ce276099e2e8950d9335db5a875689a4c6a56751ef3b1d8c537f887f6
-- ARROW_LZ4_BUILD_VERSION: v1.9.3
-- ARROW_LZ4_BUILD_SHA256_CHECKSUM: 030644df4611007ff7dc962d981f390361e6c97a34e5cbc393ddfbe019ffe2c1
-- ARROW_MIMALLOC_BUILD_VERSION: v1.7.2
-- ARROW_MIMALLOC_BUILD_SHA256_CHECKSUM: b1912e354565a4b698410f7583c0f83934a6dbb3ade54ab7ddcb1569320936bd
-- ARROW_NLOHMANN_JSON_BUILD_VERSION: v3.10.2
-- ARROW_NLOHMANN_JSON_BUILD_SHA256_CHECKSUM: 081ed0f9f89805c2d96335c3acfa993b39a0a5b4b4cef7edb68dd2210a13458c
-- ARROW_ORC_BUILD_VERSION: 1.7.0
-- ARROW_ORC_BUILD_SHA256_CHECKSUM: 45d6ba9149ffa2aaa168d61ab326f61181861c94529f26da3918a9aa2f801e39
-- ARROW_PROTOBUF_BUILD_VERSION: v3.17.3
-- ARROW_PROTOBUF_BUILD_SHA256_CHECKSUM: 77ad26d3f65222fd96ccc18b055632b0bfedf295cb748b712a98ba1ac0b704b2
-- ARROW_RAPIDJSON_BUILD_VERSION: 1a803826f1197b5e30703afe4b9c0e7dd48074f5
-- ARROW_RAPIDJSON_BUILD_SHA256_CHECKSUM: 0b6b780b6c534bfb0b23d29910bfe361e486bcfeaf106db8bc8995792072905a
-- ARROW_RE2_BUILD_VERSION: 2021-02-02
-- ARROW_RE2_BUILD_SHA256_CHECKSUM: 1396ab50c06c1a8885fb68bf49a5ecfd989163015fd96699a180d6414937f33f
-- ARROW_SNAPPY_BUILD_VERSION: 1.1.8
-- ARROW_SNAPPY_BUILD_SHA256_CHECKSUM: 16b677f07832a612b0836178db7f374e414f94657c138e6993cbfc5dcc58651f
-- ARROW_THRIFT_BUILD_VERSION: 0.13.0
-- ARROW_THRIFT_BUILD_SHA256_CHECKSUM: 7ad348b88033af46ce49148097afe354d513c1fca7c607b59c33ebb6064b5179
-- ARROW_UTF8PROC_BUILD_VERSION: v2.6.1
-- ARROW_UTF8PROC_BUILD_SHA256_CHECKSUM: 4c06a9dc4017e8a2438ef80ee371d45868bda2237a98b26554de7a95406b283b
-- ARROW_XSIMD_BUILD_VERSION: aeec9c872c8b475dedd7781336710f2dd2666cb2
-- ARROW_XSIMD_BUILD_SHA256_CHECKSUM: 0a841e6c8acf216150e4fc19fca8e29fbab9614b56ac7b96e56019264ca27b26
-- ARROW_ZLIB_BUILD_VERSION: 1.2.11
-- ARROW_ZLIB_BUILD_SHA256_CHECKSUM: c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1
-- ARROW_ZSTD_BUILD_VERSION: v1.5.0
-- ARROW_ZSTD_BUILD_SHA256_CHECKSUM: 0d9ade222c64e912d6957b11c923e214e2e010a18f39bec102f572e693ba2867
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Looking for __SIZEOF_INT128__
-- Looking for __SIZEOF_INT128__ - found
-- Building without OpenSSL support. Minimum OpenSSL version 1.0.2 required.
-- Found hdfs.h at: /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp/thirdparty/hadoop/include/hdfs.h
-- All bundled static libraries:
-- CMAKE_C_FLAGS: -O3 -DNDEBUG -Wall -Wno-attributes
-- CMAKE_CXX_FLAGS: -Wno-subobject-linkage -O3 -DNDEBUG -Wall -Wno-attributes
-- Looking for backtrace
-- Looking for backtrace - found
-- backtrace facility detected in default set of libraries
-- Found Backtrace: /usr/include
-- ---------------------------------------------------------------------
-- Arrow version: 6.0.1
--
-- Build configuration summary:
-- Generator: Unix Makefiles
-- Build type: RELEASE
-- Source directory: /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp
-- Install prefix: /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/libarrow/arrow-6.0.1
--
-- Compile and link options:
--
-- ARROW_CXXFLAGS="" [default=""]
-- Compiler flags to append when compiling Arrow
-- ARROW_BUILD_STATIC=ON [default=ON]
-- Build static libraries
-- ARROW_BUILD_SHARED=OFF [default=ON]
-- Build shared libraries
-- ARROW_PACKAGE_KIND="" [default=""]
-- Arbitrary string that identifies the kind of package
-- (for informational purposes)
-- ARROW_GIT_ID="" [default=""]
-- The Arrow git commit id (if any)
-- ARROW_GIT_DESCRIPTION="" [default=""]
-- The Arrow git commit description (if any)
-- ARROW_NO_DEPRECATED_API=OFF [default=OFF]
-- Exclude deprecated APIs from build
-- ARROW_USE_CCACHE=ON [default=ON]
-- Use ccache when compiling (if available)
-- ARROW_USE_LD_GOLD=OFF [default=OFF]
-- Use ld.gold for linking on Linux (if available)
-- ARROW_USE_PRECOMPILED_HEADERS=OFF [default=OFF]
-- Use precompiled headers when compiling
-- ARROW_SIMD_LEVEL=NONE [default=NONE|SSE4_2|AVX2|AVX512|NEON|DEFAULT]
-- Compile-time SIMD optimization level
-- ARROW_RUNTIME_SIMD_LEVEL=NONE [default=NONE|SSE4_2|AVX2|AVX512|MAX]
-- Max runtime SIMD optimization level
-- ARROW_ARMV8_ARCH=armv8-a [default=armv8-a|armv8-a+crc+crypto]
-- Arm64 arch and extensions
-- ARROW_ALTIVEC=ON [default=ON]
-- Build with Altivec if compiler has support
-- ARROW_RPATH_ORIGIN=OFF [default=OFF]
-- Build Arrow libraries with RATH set to $ORIGIN
-- ARROW_INSTALL_NAME_RPATH=ON [default=ON]
-- Build Arrow libraries with install_name set to @rpath
-- ARROW_GGDB_DEBUG=ON [default=ON]
-- Pass -ggdb flag to debug builds
--
-- Test and benchmark options:
--
-- ARROW_BUILD_EXAMPLES=OFF [default=OFF]
-- Build the Arrow examples
-- ARROW_BUILD_TESTS=OFF [default=OFF]
-- Build the Arrow googletest unit tests
-- ARROW_ENABLE_TIMING_TESTS=ON [default=ON]
-- Enable timing-sensitive tests
-- ARROW_BUILD_INTEGRATION=OFF [default=OFF]
-- Build the Arrow integration test executables
-- ARROW_BUILD_BENCHMARKS=OFF [default=OFF]
-- Build the Arrow micro benchmarks
-- ARROW_BUILD_BENCHMARKS_REFERENCE=OFF [default=OFF]
-- Build the Arrow micro reference benchmarks
-- ARROW_TEST_LINKAGE=static [default=shared|static]
-- Linkage of Arrow libraries with unit tests executables.
-- ARROW_FUZZING=OFF [default=OFF]
-- Build Arrow Fuzzing executables
-- ARROW_LARGE_MEMORY_TESTS=OFF [default=OFF]
-- Enable unit tests which use large memory
--
-- Lint options:
--
-- ARROW_ONLY_LINT=OFF [default=OFF]
-- Only define the lint and check-format targets
-- ARROW_VERBOSE_LINT=OFF [default=OFF]
-- If off, 'quiet' flags will be passed to linting tools
-- ARROW_GENERATE_COVERAGE=OFF [default=OFF]
-- Build with C++ code coverage enabled
--
-- Checks options:
--
-- ARROW_TEST_MEMCHECK=OFF [default=OFF]
-- Run the test suite using valgrind --tool=memcheck
-- ARROW_USE_ASAN=OFF [default=OFF]
-- Enable Address Sanitizer checks
-- ARROW_USE_TSAN=OFF [default=OFF]
-- Enable Thread Sanitizer checks
-- ARROW_USE_UBSAN=OFF [default=OFF]
-- Enable Undefined Behavior sanitizer checks
--
-- Project component options:
--
-- ARROW_BUILD_UTILITIES=OFF [default=OFF]
-- Build Arrow commandline utilities
-- ARROW_COMPUTE=ON [default=OFF]
-- Build the Arrow Compute Modules
-- ARROW_CSV=ON [default=OFF]
-- Build the Arrow CSV Parser Module
-- ARROW_CUDA=OFF [default=OFF]
-- Build the Arrow CUDA extensions (requires CUDA toolkit)
-- ARROW_DATASET=OFF [default=OFF]
-- Build the Arrow Dataset Modules
-- ARROW_FILESYSTEM=ON [default=OFF]
-- Build the Arrow Filesystem Layer
-- ARROW_FLIGHT=OFF [default=OFF]
-- Build the Arrow Flight RPC System (requires GRPC, Protocol Buffers)
-- ARROW_GANDIVA=OFF [default=OFF]
-- Build the Gandiva libraries
-- ARROW_GCS=OFF [default=OFF]
-- Build Arrow with GCS support (requires the GCloud SDK for C++)
-- ARROW_HDFS=OFF [default=OFF]
-- Build the Arrow HDFS bridge
-- ARROW_HIVESERVER2=OFF [default=OFF]
-- Build the HiveServer2 client and Arrow adapter
-- ARROW_IPC=ON [default=ON]
-- Build the Arrow IPC extensions
-- ARROW_JEMALLOC=OFF [default=ON]
-- Build the Arrow jemalloc-based allocator
-- ARROW_JNI=OFF [default=OFF]
-- Build the Arrow JNI lib
-- ARROW_JSON=OFF [default=OFF]
-- Build Arrow with JSON support (requires RapidJSON)
-- ARROW_MIMALLOC=OFF [default=OFF]
-- Build the Arrow mimalloc-based allocator
-- ARROW_PARQUET=OFF [default=OFF]
-- Build the Parquet libraries
-- ARROW_ORC=OFF [default=OFF]
-- Build the Arrow ORC adapter
-- ARROW_PLASMA=OFF [default=OFF]
-- Build the plasma object store along with Arrow
-- ARROW_PLASMA_JAVA_CLIENT=OFF [default=OFF]
-- Build the plasma object store java client
-- ARROW_PYTHON=OFF [default=OFF]
-- Build the Arrow CPython extensions
-- ARROW_S3=OFF [default=OFF]
-- Build Arrow with S3 support (requires the AWS SDK for C++)
-- ARROW_TENSORFLOW=OFF [default=OFF]
-- Build Arrow with TensorFlow support enabled
-- ARROW_TESTING=OFF [default=OFF]
-- Build the Arrow testing libraries
--
-- Thirdparty toolchain options:
--
-- ARROW_DEPENDENCY_SOURCE=BUNDLED [default=AUTO|BUNDLED|SYSTEM|CONDA|VCPKG|BREW]
-- Method to use for acquiring arrow's build dependencies
-- ARROW_VERBOSE_THIRDPARTY_BUILD=OFF [default=OFF]
-- Show output from ExternalProjects rather than just logging to files
-- ARROW_DEPENDENCY_USE_SHARED=ON [default=ON]
-- Link to shared libraries
-- ARROW_BOOST_USE_SHARED=OFF [default=ON]
-- Rely on boost shared libraries where relevant
-- ARROW_BROTLI_USE_SHARED=ON [default=ON]
-- Rely on Brotli shared libraries where relevant
-- ARROW_BZ2_USE_SHARED=ON [default=ON]
-- Rely on Bz2 shared libraries where relevant
-- ARROW_GFLAGS_USE_SHARED=ON [default=ON]
-- Rely on GFlags shared libraries where relevant
-- ARROW_GRPC_USE_SHARED=ON [default=ON]
-- Rely on gRPC shared libraries where relevant
-- ARROW_LZ4_USE_SHARED=ON [default=ON]
-- Rely on lz4 shared libraries where relevant
-- ARROW_OPENSSL_USE_SHARED=ON [default=ON]
-- Rely on OpenSSL shared libraries where relevant
-- ARROW_PROTOBUF_USE_SHARED=ON [default=ON]
-- Rely on Protocol Buffers shared libraries where relevant
-- ARROW_THRIFT_USE_SHARED=ON [default=ON]
-- Rely on thrift shared libraries where relevant
-- ARROW_UTF8PROC_USE_SHARED=ON [default=ON]
-- Rely on utf8proc shared libraries where relevant
-- ARROW_SNAPPY_USE_SHARED=ON [default=ON]
-- Rely on snappy shared libraries where relevant
-- ARROW_UTF8PROC_USE_SHARED=ON [default=ON]
-- Rely on utf8proc shared libraries where relevant
-- ARROW_ZSTD_USE_SHARED=ON [default=ON]
-- Rely on zstd shared libraries where relevant
-- ARROW_USE_GLOG=OFF [default=OFF]
-- Build libraries with glog support for pluggable logging
-- ARROW_WITH_BACKTRACE=ON [default=ON]
-- Build with backtrace support
-- ARROW_WITH_BROTLI=OFF [default=OFF]
-- Build with Brotli compression
-- ARROW_WITH_BZ2=OFF [default=OFF]
-- Build with BZ2 compression
-- ARROW_WITH_LZ4=OFF [default=OFF]
-- Build with lz4 compression
-- ARROW_WITH_SNAPPY=OFF [default=OFF]
-- Build with Snappy compression
-- ARROW_WITH_ZLIB=OFF [default=OFF]
-- Build with zlib compression
-- ARROW_WITH_ZSTD=OFF [default=OFF]
-- Build with zstd compression
-- ARROW_WITH_UTF8PROC=OFF [default=ON]
-- Build with support for Unicode properties using the utf8proc library
-- (only used if ARROW_COMPUTE is ON or ARROW_GANDIVA is ON)
-- ARROW_WITH_RE2=OFF [default=ON]
-- Build with support for regular expressions using the re2 library
-- (only used if ARROW_COMPUTE or ARROW_GANDIVA is ON)
--
-- Parquet options:
--
-- PARQUET_MINIMAL_DEPENDENCY=OFF [default=OFF]
-- Depend only on Thirdparty headers to build libparquet.
-- Always OFF if building binaries
-- PARQUET_BUILD_EXECUTABLES=OFF [default=OFF]
-- Build the Parquet executable CLI tools. Requires static libraries to be built.
-- PARQUET_BUILD_EXAMPLES=OFF [default=OFF]
-- Build the Parquet examples. Requires static libraries to be built.
-- PARQUET_REQUIRE_ENCRYPTION=OFF [default=OFF]
-- Build support for encryption. Fail if OpenSSL is not found
--
-- Gandiva options:
--
-- ARROW_GANDIVA_JAVA=OFF [default=OFF]
-- Build the Gandiva JNI wrappers
-- ARROW_GANDIVA_STATIC_LIBSTDCPP=OFF [default=OFF]
-- Include -static-libstdc++ -static-libgcc when linking with
-- Gandiva static libraries
-- ARROW_GANDIVA_PC_CXX_FLAGS="" [default=""]
-- Compiler flags to append when pre-compiling Gandiva operations
--
-- Advanced developer options:
--
-- ARROW_EXTRA_ERROR_CONTEXT=OFF [default=OFF]
-- Compile with extra error context (line numbers, code)
-- ARROW_OPTIONAL_INSTALL=OFF [default=OFF]
-- If enabled install ONLY targets that have already been built. Please be
-- advised that if this is enabled 'install' will fail silently on components
-- that have not been built
-- Outputting build configuration summary to /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31/cmake_summary.json
-- Configuring done
-- Generating done
-- Build files have been written to: /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31
+ /opt/shared/operations/tmp/a_504k5/cmake3/cmake-3.22.0-linux-x86_64/bin/cmake --build . --target install
[ 0%] Built target toolchain
[ 0%] Built target arrow_dependencies
[ 3%] Building CXX object src/arrow/CMakeFiles/arrow_objlib.dir/Unity/unity_20_cxx.cxx.o
[ 7%] Building CXX object src/arrow/CMakeFiles/arrow_objlib.dir/Unity/unity_21_cxx.cxx.o
In file included from /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp/src/arrow/filesystem/util_internal.h:20:0,
from /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp/src/arrow/filesystem/util_internal.cc:18,
from /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31/src/arrow/CMakeFiles/arrow_objlib.dir/Unity/unity_21_cxx.cxx:3:
/usr/include/c++/4.8.2/cstdint:38:28: fatal error: bits/c++config.h: No such file or directory
#include <bits/c++config.h>
^
compilation terminated.
In file included from /usr/include/c++/4.8.2/memory:62:0,
from /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp/src/arrow/compute/exec/hash_join_dict.h:20,
from /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmpz8ZKoj/R.INSTALLc4060d4a94cf/arrow/tools/cpp/src/arrow/compute/exec/hash_join_dict.cc:18,
from /var/opt/sas/sasconfig/Lev1/__R_TEST/Rtmp0rE0Ey/filec416a7bea6e31/src/arrow/CMakeFiles/arrow_objlib.dir/Unity/unity_20_cxx.cxx:3:
/usr/include/c++/4.8.2/bits/stl_algobase.h:59:28: fatal error: bits/c++config.h: No such file or directory
#include <bits/c++config.h>
^
compilation terminated.
make[2]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/Unity/unity_21_cxx.cxx.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/Unity/unity_20_cxx.cxx.o] Error 1
make[1]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/all] Error 2
gmake: *** [all] Error 2
**** Error building Arrow C++.
------------------------- NOTE ---------------------------
There was an issue preparing the Arrow C++ libraries.
See https://arrow.apache.org/docs/r/articles/install.html
---------------------------------------------------------
ERROR: configuration failed for package ‘arrow’
* removing ‘/var/opt/sas/sasconfig/Lev1/__R_TEST/__R4/arrow’
The downloaded source packages are in
‘/var/opt/sas/sasconfig/Lev1/__R_TEST/RtmpqtDtv3/downloaded_packages’
Warning message:
In install.packages("arrow", "/var/opt/sas/sasconfig/Lev1/__R_TEST/__R4") :
installation of package ‘arrow’ had non-zero exit status
>
Neal Richardson / @nealrichardson: @thisisnic's suggestion #2 is correct. There is this message in the installation output:
*** Building C++ library from source, but downloading thirdparty dependencies
is not possible, so this build will turn off all thirdparty features.
See install vignette for details:
https://cran.r-project.org/web/packages/arrow/vignettes/install.html
You want to run create_package_with_all_dependencies() on a connected computer, as the "Offline installation" section suggests. This just downloads the dependencies into a single package; it does not compile or build anything, so it can be any computer and does not need to match your system configuration.
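The two-machine workflow described here might look roughly like the sketch below. The wrapper function names and the tarball name are hypothetical; the first step assumes the arrow R package is already installed on the connected machine, and `repos = NULL` is one way to tell `install.packages()` to install from a local file.

```shell
# Hypothetical wrappers around the two R calls; the tarball name is an assumption.
BUNDLE="my_arrow_pkg.tar.gz"

# Run on any machine with internet access: downloads the third-party
# sources into one source package and compiles nothing.
make_arrow_bundle() {
  Rscript -e "arrow::create_package_with_all_dependencies('${BUNDLE}')"
}

# Run on the offline machine after copying the tarball over.
install_arrow_bundle() {
  Rscript -e "install.packages('${BUNDLE}', repos = NULL, type = 'source')"
}
```

The compilation itself happens only in the second step, on the offline target, so that is where the compiler setup (for example devtoolset) matters.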
Maximilian König: Thank you very much for your input, this resolved the problem.
I encountered different problems along the way that slowed progress significantly: on both macOS and Windows 10, the function create_package_with_all_dependencies() did not work properly, throwing a different kind of error on each, so I had to set up AWS Linux machines to use arrow with this function, and they had to be rather large machines to get it to fully compile.
Nicola Crane / @thisisnic: Thanks for sticking with this, [~max_koe].
I don't suppose, if you still have access to the error logs, you'd mind posting them here? It'd be great to be able to inspect them and see if we need to change either the code itself or perhaps the instructions about which machines to compile on for future users. No worries if not though!
When you say "rather large", how large do you mean? I know there are some dependencies (e.g. Unity) which can lead to the need for higher resources, but I'd like to make sure there's nothing we've overlooked, if we do update the instructions.
Maximilian König:
I tried the installation with both t2.micro and t3.medium, which are rather small, and it failed there. It worked with i3en.large, which is sized comparably to a desktop PC, I would say:
Type | Subtype | CPU (Cores) | Ram (GB)
---|---|---|---
I would suggest opening a new issue for the errors on macOS / Windows, and closing this one as it has been solved.
Neal Richardson / @nealrichardson:
create_package_with_all_dependencies() doesn't compile anything; it just downloads things and puts them into a single tarball: https://github.com/apache/arrow/blob/master/r/R/install-arrow.R#L191-L239
It should be platform independent though, so if it isn't, that should be fixed.
Maximilian König: I am aware that the function does not compile anything. I was saying that installing the arrow package on AWS Linux servers took larger machines to finish compiling (before I could use the function inside the package).
For the downloading issue on macOS, I have now created a separate issue: ARROW-15092
Nicola Crane / @thisisnic: Thanks for opening that other ticket!
I tried to install Arrow inside R on a RHEL system inside a devtoolset-9 environment in a fresh R environment (empty R_LIBS).
Here is what I did:
Here is the prompt output, and the failed write of a parquet file afterwards:
Note that I am using a proxy system to download the R packages and cannot access the internet directly to download other dependencies.
Reporter: Maximilian König
Original Issue Attachments:
Note: This issue was originally created as ARROW-15000. Please see the migration documentation for further details.