apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[VL] How to bundle third party libs #7645

Open z-anderson opened 1 week ago

z-anderson commented 1 week ago

Problem description

I'm building Velox with static linking, using ./dev/builddeps-veloxbe.sh --enable_vcpkg=ON --spark_version=3.5. The build produces two .so files, libgluten.so and libvelox.so, in cpp/build/releases. The libvelox.so is missing links for folly, protobuf, and arrow (from nm -u). It seems like we'll need a third-party lib jar with these libraries, in addition to a jar with gluten/velox. How do you recommend bundling the binaries so we can use them in another project, please? Thank you!

System information

Velox System Info v0.0.2
Commit: 88912821cef4795c667669c3b7f6f0dc3eebd098
CMake Version: 3.16.3
System: Linux-5.15.0-1070-aws
Arch: x86_64
CPU Name: Model name: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 9.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 9.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt

CMake log

No response

zhztheplayer commented 1 week ago

The libvelox.so is missing links for folly, protobuf, and arrow (from nm -u)

@PHILO-HE

Also, do you have a Docker environment set up?

If so, I'd recommend following https://github.com/apache/incubator-gluten/tree/main/tools/gluten-te/centos/examples/buildhere-veloxbe-portable-libs to create the build. It's the most stable way so far.

z-anderson commented 1 week ago

Thank you @zhztheplayer. We're trying to build Gluten and use it, along with the necessary third-party libraries, in another repository. It seems like we'll need a third-party lib jar with these libraries, in addition to a jar with gluten/velox. Do you know if that's right, please?

zhztheplayer commented 1 week ago

It seems like we'll need a third-party lib jar with these libraries, in addition to a jar with gluten/velox

Usually only libgluten.so and libvelox.so are needed when using a static build with vcpkg. I'm not sure why other libraries such as folly are required in your case. Have you customized Gluten's build procedure in any way?

z-anderson commented 1 week ago

Arrow and protobuf are missing links in libvelox.so (I can see it in nm -u output). We don't have any customization.

FelixYBW commented 1 week ago

What's your output of ldd libvelox.so? This is mine:

 ldd ./cpp/build/releases/libvelox.so
        linux-vdso.so.1 (0x00007ffd14a91000)
        libgluten.so => /home/sparkuser/gluten/cpp/build/releases/libgluten.so (0x00007f0329e00000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f0331d08000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f032b319000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0329a00000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0331d2c000)

z-anderson commented 1 week ago

My output is

 ldd cpp/build/releases/libvelox.so 
    linux-vdso.so.1 (0x00007ffd3ad8c000)
    libgluten.so => /home/circleci/project/cpp/build/releases/libgluten.so (0x00007f828ab81000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f828ab52000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f828ab4c000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f828a9fd000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f828a80b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f8291733000)

FelixYBW commented 1 week ago

So folly is already statically linked into your libvelox.so. There is no dynamic dependency on it.
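A quick way to compare ldd listings like the two above is to filter out the system libraries and see what is left. Below is a minimal Python sketch. The prefix lists are illustrative assumptions, not something Gluten defines, and the sample input is the output posted above.

```python
# Name prefixes treated as system libraries (an illustrative assumption).
SYSTEM_PREFIXES = ("linux-vdso", "ld-linux", "libc.", "libm.", "libdl.",
                   "libpthread.", "libresolv.", "librt.")
# Gluten's own library, which libvelox.so is expected to depend on.
EXPECTED_PREFIXES = ("libgluten.",)

# The ldd output for libvelox.so posted earlier in this thread.
SAMPLE = """\
    linux-vdso.so.1 (0x00007ffd3ad8c000)
    libgluten.so => /home/circleci/project/cpp/build/releases/libgluten.so (0x00007f828ab81000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f828ab52000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f828ab4c000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f828a9fd000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f828a80b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f8291733000)
"""


def unexpected_deps(ldd_output):
    """Return library names from ldd text that are neither system nor Gluten libs."""
    deps = []
    for line in ldd_output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # The first token is the library name, or an absolute path for the loader.
        name = tokens[0].rsplit("/", 1)[-1]
        if name.startswith(SYSTEM_PREFIXES + EXPECTED_PREFIXES):
            continue
        deps.append(name)
    return deps


if __name__ == "__main__":
    # An empty list means no third-party shared library is required at load time.
    print(unexpected_deps(SAMPLE))
```

If a shared folly, arrow, or thrift had been picked up at link time, its name (for example a libfolly.so entry) would show up in the returned list.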

z-anderson commented 1 week ago

Right now we're using a jar with the third-party libraries (protobuf, arrow, thrift, and snappy), and it works. We get the .so files for these libraries from a dynamically linked build.

PHILO-HE commented 1 week ago

Arrow and protobuf are missing links in libvelox.so (I can see it in nm -u output). We don't have any customization.

@z-anderson, it should not be an issue. Arrow and protobuf are statically linked into libgluten.so (with enable_vcpkg=ON), and libvelox.so is dynamically linked against libgluten.so. The arrow/protobuf-related undefined symbols in libvelox.so are therefore resolved at runtime by the dynamic linker, so there is no issue.
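If you want to verify this offline, one option is to compare the nm -u listing of libvelox.so against the defined dynamic symbols of libgluten.so (for example from nm -D --defined-only). The Python sketch below is only an illustration: the helper names are hypothetical, the sample symbols are made-up input, and the namespace markers are a heuristic based on how the Itanium C++ ABI encodes namespace names.

```python
# Substrings that identify third-party namespaces in mangled names
# (heuristic assumption: length-prefixed namespace identifiers).
MARKERS = ("5arrow", "6google8protobuf", "6apache6thrift", "5folly")


def parse_nm(nm_output):
    """Map symbol name -> type letter from nm text output."""
    symbols = {}
    for line in nm_output.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        # Undefined entries have no address column, so index from the end.
        kind, name = tokens[-2], tokens[-1]
        # Drop any symbol-version suffix such as @GLIBC_2.2.5.
        symbols[name.split("@", 1)[0]] = kind
    return symbols


def unresolved(undefined_nm, provider_nm, markers=MARKERS):
    """Third-party symbols undefined in one library and not defined by the other."""
    needed = {s for s, k in parse_nm(undefined_nm).items() if k == "U"}
    provided = {s for s, k in parse_nm(provider_nm).items()
                if k not in ("U", "w", "v")}
    return sorted(s for s in needed - provided
                  if any(m in s for m in markers))


if __name__ == "__main__":
    # Made-up sample listings, for illustration only.
    velox_undefined = (
        "                 U _ZN5arrow6Status2OKEv\n"
        "                 U _ZTIN6apache6thrift8protocol9TProtocolE\n"
        "                 U malloc@GLIBC_2.2.5\n"
    )
    gluten_defined = "0000000000401000 T _ZN5arrow6Status2OKEv\n"
    print(unresolved(velox_undefined, gluten_defined))
```

Anything this returns is a third-party symbol that libgluten.so does not provide, so it would have to come from some other shared library at runtime.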

For folly, we expect it to be statically linked into libvelox.so when enable_vcpkg is ON. If nm -u shows undefined folly symbols, your build environment possibly contains a shared folly lib that was linked to libvelox.so instead.

As Hongze mentioned above, with enable_vcpkg=ON only libgluten.so and libvelox.so are needed. Both are packed into the Gluten jar so that users can deploy it easily, and there is no need to create or deploy another jar with third-party libs.

z-anderson commented 4 days ago

Thank you. Could you please share the full command you use with --enable_vcpkg=ON?

The error I'm getting now is gluten-17219265314440258101/libgluten-x86_64.so: undefined symbol: _ZTIN6apache6thrift8protocol9TProtocolE . I can see in nm -u that thrift isn't statically linked to libvelox.so.
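For reference, the mangled name in that error can be decoded to confirm which library it belongs to. c++filt does this properly. The Python sketch below is a hand-rolled illustration that handles only this simple nested-name form (the function name is hypothetical).

```python
def decode_nested_name(mangled):
    """Decode a simple Itanium-ABI name of the form <prefix>N<len><id>...E.

    Only the _ZTI, _ZTV and _ZTS prefixes with plain length-prefixed
    identifiers are handled; anything else raises ValueError.
    """
    prefixes = {
        "_ZTI": "typeinfo for ",
        "_ZTV": "vtable for ",
        "_ZTS": "typeinfo name for ",
    }
    kind = prefixes.get(mangled[:4])
    if kind is None or mangled[4:5] != "N" or not mangled.endswith("E"):
        raise ValueError("unsupported mangled name: " + mangled)

    body = mangled[5:-1]
    parts = []
    i = 0
    while i < len(body):
        # Read the decimal length that precedes each identifier.
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            raise ValueError("unsupported mangled name: " + mangled)
        length = int(body[i:j])
        if j + length > len(body):
            raise ValueError("truncated mangled name: " + mangled)
        parts.append(body[j:j + length])
        i = j + length
    return kind + "::".join(parts)


if __name__ == "__main__":
    print(decode_nested_name("_ZTIN6apache6thrift8protocol9TProtocolE"))
```

For the symbol above this yields the typeinfo for apache::thrift::protocol::TProtocol, i.e. a symbol that belongs to the thrift library.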

PHILO-HE commented 4 days ago

@z-anderson, --enable_vcpkg=ON is enough. For subsequent builds you can also set --enable_ep_cache=ON and --build_arrow=OFF to avoid redundant rebuilding, but that is unrelated to your issue.

The error I'm getting now is gluten-17219265314440258101/libgluten-x86_64.so: undefined symbol: _ZTIN6apache6thrift8protocol9TProtocolE . I can see in nm -u that thrift isn't statically linked to libvelox.so.

I guess this is because your build environment has a shared thrift lib installed, whereas we expect the static thrift lib to be linked. You can uninstall the shared lib if possible. Alternatively, you can try building Gluten in Docker, as Hongze mentioned above.
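To check whether a shared thrift lib is present on the build machine, something like the following Python sketch may help. The directory list is an assumption about a typical Linux layout, and the helper name is hypothetical. Adjust both to your environment.

```python
import glob
import os

# Directories to scan (an assumed, typical list for a Linux host).
DEFAULT_DIRS = [
    "/usr/lib",
    "/usr/lib64",
    "/usr/local/lib",
    "/usr/local/lib64",
    "/usr/lib/x86_64-linux-gnu",
]


def find_shared_libs(name, dirs=DEFAULT_DIRS):
    """Return sorted paths matching lib<name>*.so* in the given directories."""
    hits = []
    for directory in dirs:
        pattern = os.path.join(directory, "lib" + name + "*.so*")
        hits.extend(glob.glob(pattern))
    return sorted(hits)


if __name__ == "__main__":
    # Static archives (.a) are ignored; only shared objects are reported.
    for lib in ("thrift", "folly", "protobuf", "arrow"):
        print(lib, find_shared_libs(lib))
```

Any path it reports is a candidate for a shared library that the linker might have picked up in place of the static one.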

BTW, could you tell us which company you are working for?