finos / perspective

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
https://perspective.finos.org/
Apache License 2.0
8.58k stars 1.19k forks source link

Arrow headers installed to /usr/local can break perspective's subdirectory Arrow build #2792

Closed texodus closed 1 month ago

texodus commented 1 month ago

Discussed in https://github.com/finos/perspective/discussions/2791

Originally posted by **tomjakubowski** October 15, 2024 ## Bug Report ### Steps to Reproduce: 1. Build and install Apache Arrow to /usr/local 2. Try and build perspective, with its own included Arrow dependency ### Expected Result: Perspective builds Arrow with its own set of options (defined in FindInstallDependency.cmake) ### Actual Result: Flags from the global installation, defined in /usr/local/include/arrow/util/config.h, will interfere with perspective's Arrow build. In my case, that global config file had `#define ARROW_JEMALLOC`, while perspective's build had `(set ARROW_JEMALLOC OFF)` in cmake, which are inconsistent flags and so Arrow was miscompiled with some missing symbols referencing their Jemalloc memory pool implementation. ``` nm -mg /Users/tom/perspective/perspective/rust/perspective-python/perspective/perspective.abi3.so | grep malloc (undefined) external __ZN5arrow11memory_pool8internal17JemallocAllocator13ReleaseUnusedEv (dynamically looked up) (undefined) external __ZN5arrow11memory_pool8internal17JemallocAllocator15AllocateAlignedExxPPh (dynamically looked up) (undefined) external __ZN5arrow11memory_pool8internal17JemallocAllocator17DeallocateAlignedEPhxx (dynamically looked up) (undefined) external __ZN5arrow11memory_pool8internal17JemallocAllocator17ReallocateAlignedExxxPPh (dynamically looked up) ``` Found this by running a `make VERBOSE=1` from the arrow-build directory (at `rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/arrow-build/`). Logs below. Notice that the `-I` for /usr/local/include comes before the `-I` for arrow's own source file. ``` cd /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/arrow-build/src/arrow && /usr/bin/c++ -DARROW_WITH_TIMING_TESTS -I/Users/tom/perspective/perspective/rust/perspective-server/cpp/perspective/src/include -I/usr/local/include -I/Users/tom/perspective/perspective/rust/perspective-server/cpp/perspective/../../python/perspective/perspective/perspective/include -I/Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/arrow-src/cpp/src -I/Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/arrow-src/cpp/src/generated -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/date-src/extras/date/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/date-src/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/date-src -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/hopscotch-src/extras/hopscotch/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/hopscotch-src/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/hopscotch-src -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/ordered-map-src/extras/ordered-map/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/ordered-map-src/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/ordered-map-src -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/rapidjson-src/extras/rapidjson/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/rapidjson-src/include -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/rapidjson-src -isystem /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/arrow-build/src -fno-aligned-new -O3 -Qunused-arguments -fcolor-diagnostics -Wall -Wno-unknown-warning-option -Wno-pass-failed -O3 -DNDEBUG -O2 -std=c++17 -arch arm64 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX13.3.sdk -fPIC -MD -MT arrow-build/src/arrow/CMakeFiles/arrow_memory_pool.dir/memory_pool.cc.o -MF CMakeFiles/arrow_memory_pool.dir/memory_pool.cc.o.d -o CMakeFiles/arrow_memory_pool.dir/memory_pool.cc.o -c /Users/tom/perspective/perspective/rust/target/release/build/perspective-server-bb669fae7f7ee29c/out/build/arrow-src/cpp/src/arrow/memory_pool.cc ``` ### Environment: macOS 13.3.1 perspective master, 13b26d777f1527ea2fac8e3778b266e3050cd7ea ### Additional Context: [Here](https://github.com/apache/arrow/blob/main/cpp/src/arrow/memory_pool.cc#L39) is the `#include` which was picking up from /usr/local instead of the build directory. This seems like a potentially common hazard/footgun for people building from source, who may have an installed Arrow elsewhere on their system. I do not know if a libarrow installed to /usr (as might be done by a Linux distro package manager) causes the same problem but would be worth investigating.
tomjakubowski commented 1 month ago

What I'm working on to fix this is to remove all calls to include_directories and instead to collect them into a single target_include_directories call on just the psp target. This will help us better control the order of include paths when building perspective. Importantly, we want Boost_INCLUDE_DIRS to be last in the list, because it's likely to be in a system directory where other dependencies with headers, like re2 or arrow, might be installed.

It's possible I'll need to back some of those out in order to get dependencies to build, but I hope not!

tomjakubowski commented 1 month ago

It would be even better not to depend on find_package(Boost), and add Boost as a psp_build_dep(), which should be easier now that we've discovered in the Conda work that the build only requires header-only Boost.

But I'd like to punt on that for future work and instead do the cruder thing first with target_include_directories. Boost would be a nice first step at working out how we can support either find_package() or psp_build_dep() for each of our dependencies. This will help us further down the road of making the Conda build fully dynamically link its dependencies.