apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Python] build python lib failed on both X86 and ARMv8 #20352

Open asfimport opened 2 years ago

asfimport commented 2 years ago

I want to build the pyarrow library on an ARM platform. I downloaded the pyarrow 8.0.0 source code and ran "python setup.py install". An error occurred:

Using ld linker
Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
-- Build Type: RELEASE
-- Generator: Unix Makefiles
-- Build output directory: /root/build/pyarrow-8.0.0/build/temp.linux-x86_64-3.6/release
-- Found Python3: /root/anaconda3/envs/py36test/bin/python (found version "3.6.13") found components: Interpreter Development.Module NumPy
-- Found Python3Alt: /root/anaconda3/envs/py36test/bin/python
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.27.1")
-- Could NOT find Arrow (missing: Arrow_DIR)
-- Checking for module 'arrow'
--   No package 'arrow' found
CMake Error at /usr/local/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find Arrow (missing: ARROW_INCLUDE_DIR ARROW_LIB_DIR
  ARROW_FULL_SO_VERSION ARROW_SO_VERSION)
Call Stack (most recent call first):
  /usr/local/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  cmake_modules/FindArrow.cmake:450 (find_package_handle_standard_args)
  cmake_modules/FindArrowPython.cmake:46 (find_package)
  CMakeLists.txt:231 (find_package)

-- Configuring incomplete, errors occurred!

This error always occurs, no matter which version I choose (pyarrow 8.0.0 or 2.0.0) and no matter which platform (x86 or ARM C compiler) I use. When I download the Arrow source code, enter the python folder, and run "python setup.py install", the same error occurs.

It seems to be a bug in the CMake files. I cannot build the Python library for my ARM platform.

Environment: OS: CentOS 7.9, CPU: x86_64
Reporter: chendan
Watchers: Rok Mihevc / @rok

Original Issue Attachments:

Note: This issue was originally created as ARROW-17265. Please see the migration documentation for further details.

asfimport commented 2 years ago

Rok Mihevc / @rok: Did you build the C++ library too?

Build instructions are here: https://arrow.apache.org/docs/developers/python.html#build-and-test
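The "Could NOT find Arrow" error above is consistent with the Arrow C++ libraries not being installed where the Python build looks for them. A minimal sketch of a sanity check, assuming the C++ build was installed under the directory named by the ARROW_HOME environment variable (the helper name `find_arrow_libs` is hypothetical, and the `lib`/`lib64` layout is an assumption):

```python
import glob
import os


def find_arrow_libs(arrow_home):
    """Return the libarrow.* files found under arrow_home/lib and arrow_home/lib64."""
    found = []
    # Assumption: the install step placed the libraries in lib or lib64.
    for libdir in ("lib", "lib64"):
        pattern = os.path.join(arrow_home, libdir, "libarrow.*")
        found.extend(sorted(glob.glob(pattern)))
    return found


if __name__ == "__main__":
    home = os.environ.get("ARROW_HOME")
    if not home:
        print("ARROW_HOME is not set; the Python build may not find Arrow C++")
    else:
        libs = find_arrow_libs(home)
        print("\n".join(libs) if libs else "no libarrow.* found under " + home)
```

If the script reports no libraries, building and installing the C++ part first, as the linked instructions describe, is the likely next step.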

asfimport commented 2 years ago

chendan: @rok I followed these steps. However, when I ran cmake, it still failed. I have attached the log files.

CMakeError.log

CMake Warning at cmake_modules/FindSnappyAlt.cmake:25 (find_package):
  By not providing "FindSnappy.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Snappy", but
  CMake did not find one.

  Could not find a package configuration file provided by "Snappy" with any
  of the following names:

    SnappyConfig.cmake
    snappy-config.cmake

  Add the installation prefix of "Snappy" to CMAKE_PREFIX_PATH or set
  "Snappy_DIR" to a directory containing one of the above files.  If "Snappy"
  provides a separate development package or SDK, be sure it has been
  installed.
Call Stack (most recent call first):
  cmake_modules/ThirdpartyToolchain.cmake:267 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:1136 (resolve_dependency)
  CMakeLists.txt:575 (include)

CMake Error at /usr/local/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find SnappyAlt (missing: Snappy_LIB Snappy_INCLUDE_DIR)
Call Stack (most recent call first):
  /usr/local/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  cmake_modules/FindSnappyAlt.cmake:89 (find_package_handle_standard_args)
  cmake_modules/ThirdpartyToolchain.cmake:267 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:1136 (resolve_dependency)
  CMakeLists.txt:575 (include)

asfimport commented 2 years ago

Rok Mihevc / @rok: It looks like you're missing Snappy.

Did you install dependencies? https://arrow.apache.org/docs/developers/python.html#using-conda if you're using conda or https://arrow.apache.org/docs/developers/python.html#using-system-and-bundled-dependencies if you're using Homebrew.

asfimport commented 2 years ago

chendan: @rok 

Thanks! The cmake command now succeeds on the x86 platform. I then tried compiling with an ARMv8 C compiler. After I set the C and C++ compilers to my own ARM compiler and ran cmake, an error occurred:

Error: unknown architecture `nocona'

I know that CMAKE_C_FLAGS and CMAKE_CXX_FLAGS should be set in the cmake command. When I set them, cmake succeeds with the ARMv8 cross compiler.
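A rough sketch of how such a cmake command line could be assembled. The variables CMAKE_C_COMPILER, CMAKE_CXX_COMPILER, CMAKE_C_FLAGS and CMAKE_CXX_FLAGS are standard CMake cache variables; the helper name `cross_cmake_args`, the `g++` compiler path, the `-march=armv8-a` flag and the `..` source directory are assumptions for illustration, not values taken from this thread:

```python
import shlex


def cross_cmake_args(cc, cxx, flags, source_dir=".."):
    """Build a cmake command line that pins the cross compiler and its flags."""
    return [
        "cmake",
        "-DCMAKE_C_COMPILER=" + cc,
        "-DCMAKE_CXX_COMPILER=" + cxx,
        # Setting these explicitly means CMake does not take its initial
        # compiler flags from the CFLAGS/CXXFLAGS environment variables.
        "-DCMAKE_C_FLAGS=" + flags,
        "-DCMAKE_CXX_FLAGS=" + flags,
        source_dir,
    ]


if __name__ == "__main__":
    # Assumed toolchain prefix; the g++ name is a guess based on the gcc path.
    prefix = "/opt/aarch64-kedacom-linux/bin/aarch64-kedacom-linux-gnu-"
    args = cross_cmake_args(prefix + "gcc", prefix + "g++", "-march=armv8-a")
    print(" ".join(shlex.quote(a) for a in args))
```

Note that these settings only affect what CMake itself compiles. Sub-builds that run their own configure script may still read CFLAGS from the environment.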

Then I ran "make -j4". It failed:

(pyarrow-dev) [root@localhost build]# make -j4
[  0%] Built target toolchain
[  0%] Performing configure step for 'jemalloc_ep'
[  4%] Built target arrow_dataset_objlib
CMake Error at /root/build/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-configure-DEBUG.cmake:37 (message):
  Command failed: 77

   './configure' 'AR=/opt/aarch64-kedacom-linux/bin/aarch64-kedacom-linux-gnu-ar' 'CC=/opt/aarch64-kedacom-linux/bin/aarch64-kedacom-linux-gnu-gcc' '--prefix=/root/build/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep/dist/' '--with-jemalloc-prefix=je_arrow_' '--with-private-namespace=je_arrow_private_' '--without-export' '--disable-shared' '--disable-cxx' '--disable-libdl' '--disable-initial-exec-tls'

  See also

    /root/build/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-configure-*.log

-- stdout output is:
checking for xsltproc... /usr/bin/xsltproc
1
checking for x86_64-conda-linux-gnu-gcc... 2
/opt/aarch64-kedacom-linux/bin/aarch64-kedacom-linux-gnu-gcc
write conftest
checking whether the C compiler works... no

-- stderr output is:
configure: error: in `/root/build/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep':
configure: error: C compiler cannot create executables
See `config.log' for more details

CMake Error at /root/build/arrow/cpp/build/jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-configure-DEBUG.cmake:47 (message):
  Stopping after outputting logs.

make[2]: *** [CMakeFiles/jemalloc_ep.dir/build.make:92: jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-configure] Error 1
make[1]: *** [CMakeFiles/Makefile2:725: CMakeFiles/jemalloc_ep.dir/all] Error 2
make: *** [Makefile:146: all] Error 2

Can I just set -DARROW_JEMALLOC=OFF to skip compiling jemalloc? If I skip it, will that affect building the pyarrow libraries?

I checked the configure script. The check command is: $CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5. One of these variables must be wrong, so I added "echo" statements to the configure script. CC is correct, but CFLAGS is wrong:

-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /root/anaconda3/envs/pyarrow-dev/include

The "-march=nocona -mtune=haswell" options are again wrong for my ARM cross compiler. How do I set the flags correctly for the whole "make -j4" run? Which configuration parameters should be set?

Thanks!
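The CFLAGS value quoted above looks like a default exported into the environment for an x86_64 toolchain (an assumption; the thread does not say where it comes from). One possible approach is to strip the architecture-specific options from CFLAGS and CXXFLAGS before running cmake and make. A minimal sketch, with `strip_arch_flags` as a hypothetical helper name:

```python
import os
import shlex

# Options that select a specific CPU and therefore break a cross compiler
# targeting a different architecture.
ARCH_PREFIXES = ("-march=", "-mtune=")


def strip_arch_flags(flags):
    """Return the flag string with -march=/-mtune= options removed."""
    kept = [f for f in shlex.split(flags) if not f.startswith(ARCH_PREFIXES)]
    return " ".join(shlex.quote(f) for f in kept)


if __name__ == "__main__":
    # Print shell export lines that can be evaluated before configuring.
    for var in ("CFLAGS", "CXXFLAGS"):
        cleaned = strip_arch_flags(os.environ.get(var, ""))
        print("export {}={}".format(var, shlex.quote(cleaned)))
```

Any flags specific to the ARM target would then be appended by hand; which ones are appropriate depends on the toolchain.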

asfimport commented 2 years ago

Rok Mihevc / @rok: [~atptour2017] I've never cross-compiled Arrow so I can't really speak to it.

To be clear - you're on CentOS 7.9 and x86_64? I'm not sure about compatibility with arm64. @cyb70289 would know more.

As for jemalloc - if it's not available for your platform you can try with mimalloc, see available cmake options here: https://arrow.apache.org/docs/developers/cpp/building.html#optional-components

 

asfimport commented 2 years ago

chendan: @rok 

This problem was solved by exporting CFLAGS. However, I have another question: can I set -DARROW_WITH_SNAPPY=OFF? In https://arrow.apache.org/docs/developers/python.html#using-conda it is ON. The snappy shared library is needed for linking, but I found that it is very difficult to build.

asfimport commented 2 years ago

Rok Mihevc / @rok: [~atptour2017] I'm not sure whether the Python build requires Snappy; if it does, you'll need it.

What error are you getting now? Another option is to try a bundled build, see here: https://arrow.apache.org/docs/developers/cpp/building.html#build-dependency-management

asfimport commented 2 years ago

Rok Mihevc / @rok: [~atptour2017] It seems like a boost issue. If I understand correctly you're using conda. Did you conda activate pyarrow-dev and export ARROW_HOME=$CONDA_PREFIX?

asfimport commented 2 years ago

chendan: @rok 

I ran make -j4 in cpp/build and found that some libraries were built:

libarrow.a                       libarrow_python.a   libarrow_python.so.200      libarrow.so      libarrow.so.200.0.0 libarrow_bundled_dependencies.a  libarrow_python.so  libarrow_python.so.200.0.0  libarrow.so.200  libparquet.a

However, the make command did not finish successfully:

/opt/aarch64-kedacom-linux/lib/gcc/aarch64-kedacom-linux-gnu/8.3.0/../../../../aarch64-kedacom-linux-gnu/bin/ld: /root/anaconda3/envs/pyarrow-dev/lib/libthrift.so: error adding symbols: file in wrong format

This is because libthrift.so was not built by my ARM cross compiler. Are the libraries listed above enough for building the Python libraries? If they are, I do not need to build Thrift.
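The linker's "file in wrong format" message typically means a library was built for a different architecture than the link target. A small sketch that reads the `e_machine` field of an ELF header to check which architecture a shared library targets. The field offset and machine codes follow the ELF specification; the function name `elf_machine` is hypothetical:

```python
import struct

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_machine(header):
    """Return the architecture name encoded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at byte offset 18.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return EM_NAMES.get(machine, "unknown ({})".format(machine))


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, elf_machine(f.read(20)))
```

Running it on the libthrift.so from the conda environment and on one of the freshly built libarrow files would show whether the two target different architectures.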

 

I have tried it. An error occurred:

(pyarrow-dev) [root@localhost python]# python setup.py install
Traceback (most recent call last):
  File "setup.py", line 634, in <module>
    url='https://arrow.apache.org/'
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/distutils/core.py", line 108, in setup
    _setup_distribution = dist = klass(attrs)
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/setuptools/dist.py", line 457, in __init__
    for k, v in attrs.items()
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/distutils/dist.py", line 281, in __init__
    self.finalize_options()
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/setuptools/dist.py", line 830, in finalize_options
    for ep in sorted(loaded, key=by_order):
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/setuptools/dist.py", line 829, in <lambda>
    loaded = map(lambda e: e.load(), filtered)
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2450, in load
    return self.resolve()
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2456, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/root/anaconda3/envs/pyarrow-dev/lib/python3.6/site-packages/setuptools_scm/__init__.py", line 5
    from __future__ import annotations
    ^
SyntaxError: future feature annotations is not defined

I googled it, and it seems that I need Python 3.7 to run setup.py. But I need to use Python 3.6. How can I solve this?
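For context, `from __future__ import annotations` (PEP 563) is only recognized from Python 3.7 onward, so the SyntaxError suggests that the installed setuptools_scm release is newer than the 3.6 interpreter can import. A small sketch that checks whether the running interpreter accepts the import (the function name is hypothetical):

```python
import sys


def supports_future_annotations():
    """Return True if this interpreter accepts 'from __future__ import annotations'."""
    try:
        # Compiling is enough to trigger the SyntaxError on older interpreters.
        compile("from __future__ import annotations", "<check>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(sys.version.split()[0], supports_future_annotations())
```

If this prints False, installing an older setuptools_scm release that still supports Python 3.6 is a plausible remedy. The thread does not confirm which version that would be.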