blue-yonder / turbodbc

Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
http://turbodbc.readthedocs.io/en/latest
MIT License

Building on Fedora 24 #104

Open aschmu opened 7 years ago

aschmu commented 7 years ago

Hi,

I'm trying to build turbodbc for Fedora 24 from scratch, as pip installing didn't cut it (I'm getting undefined symbols, maybe because my system has gcc 6.2.1, so I'm switching to gcc 4.8.x). However, during the build process I get "No package 'arrow' found". I ran pip install arrow in the conda environment I created for the build, but to no avail. Moreover, cmake is complaining about an unset variable, Numpy_INCLUDE_DIR, though numpy is also installed in the virtualenv. I followed the instructions on this page, and maybe I'm missing something, as I'm not an expert in building packages from scratch.

-- Check for working CXX compiler: /home/uxgxxx/.conda/envs/py35/bin/c++
-- Check for working CXX compiler: /home/uxgxxx/.conda/envs/py35/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build type: Debug
-- The C compiler identification is GNU 4.8.5
-- Check for working C compiler: /home/uxgxxx/.conda/envs/py35/bin/cc
-- Check for working C compiler: /home/uxgxxx/.conda/envs/py35/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Found PythonInterp: /home/uxgxxx/.conda/envs/py35/bin/python3.5 (found version "3.5.3") 
-- Found PythonLibs: /home/uxgxxx/.conda/envs/py35/lib/libpython3.5m.so
-- pybind11 v2.1.dev0
-- Boost version: 1.60.0
-- Found the following Boost libraries:
--   locale
-- Detecting unixODBC library
--   Found header files at: /usr/include
--   Found library at: /usr/lib64/libodbc.so
-- Boost version: 1.60.0
-- Found the following Boost libraries:
--   system
--   date_time
--   locale
-- Detecting unixODBC library
--   Found header files at: /usr/include
--   Found library at: /usr/lib64/libodbc.so
-- Boost version: 1.60.0
-- Found the following Boost libraries:
--   system
-- Detecting unixODBC library
--   Found header files at: /usr/include
--   Found library at: /usr/lib64/libodbc.so
-- Performing Test HAS_CPP14_FLAG
-- Performing Test HAS_CPP14_FLAG - Failed
-- Performing Test HAS_CPP11_FLAG
-- Performing Test HAS_CPP11_FLAG - Success
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Failed
-- LTO disabled (not supported by the compiler and/or linker)
-- Boost version: 1.60.0
-- Found the following Boost libraries:
--   system
-- Detecting unixODBC library
--   Found header files at: /usr/include
--   Found library at: /usr/lib64/libodbc.so
-- Found PkgConfig: /bin/pkg-config (found version "0.29") 
-- Checking for module 'arrow'
--   No package 'arrow' found
-- Could not find the Arrow library. Looked for headers in , and for libs in 
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
Numpy_INCLUDE_DIR
used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
 used as include directory in directory /xxx/turbodbc/cpp/turbodbc_numpy/Library
xhochy commented 7 years ago

@aschmu The conda packages for Arrow are arrow-cpp and pyarrow (see https://github.com/conda-forge/arrow-cpp-feedstock and https://github.com/conda-forge/pyarrow-feedstock); for pip, the package is pyarrow. Sadly, there is a name collision between the Python date package arrow and the Apache Arrow project.

MathMagique commented 7 years ago

Sorry to hear of your build issues. Could you please provide the following outputs:

aschmu commented 7 years ago

Hello, so here we go :

> which python
/.conda/envs/py35/bin/python
> pip freeze
cycler==0.10.0
decorator==4.0.11
ipykernel==4.5.2
ipython==5.3.0
ipython-genutils==0.1.0
jupyter-client==5.0.0
jupyter-core==4.3.0
matplotlib==2.0.0
nltk==3.2.2
numpy==1.12.0
pandas==0.19.2
patsy==0.4.1
pexpect==4.2.1
pickleshare==0.7.4
prompt-toolkit==1.0.9
ptyprocess==0.5.1
pyarrow==0.4.1
pybind11==2.1.1
Pygments==2.2.0
pyparsing==2.2.0
python-dateutil==2.6.0
pytz==2016.10
pyzmq==16.0.2
scikit-learn==0.18.1
scipy==0.19.0
seaborn==0.7.1
simplegeneric==0.8.1
six==1.10.0
tornado==4.4.2
traitlets==4.3.2
wcwidth==0.1.7

> echo $VIRTUAL_ENV
[empty]
> echo $PYTHON
[empty]
> echo $PYTHON_INCLUDE_DIR
[empty]
MathMagique commented 7 years ago

Ok, now I realize that you are using conda and not virtualenv to build an environment. That's alright, but the script that detects where NumPy is located does not support that out of the box.

Please try the following: In your conda environment, please

export VIRTUAL_ENV=/home/uxxxx/.conda/envs/py35

before calling cmake. This should provide the necessary pointer to find NumPy, and proceed with the source build.

xhochy commented 7 years ago

@MathMagique CONDA_PREFIX is the equivalent of VIRTUAL_ENV. We should also use that.
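
A minimal sketch of the lookup suggested here, assuming the detection script only needs the prefix of the active environment and that VIRTUAL_ENV should take precedence when both variables are set. The helper name `find_env_prefix` is hypothetical and is not part of turbodbc's build scripts.

```python
import os


def find_env_prefix(environ=None):
    """Return the active environment prefix, or None if none is set.

    Hypothetical helper: checks VIRTUAL_ENV (set by virtualenv) first,
    then CONDA_PREFIX (set by conda when an environment is activated).
    """
    if environ is None:
        environ = os.environ
    for name in ("VIRTUAL_ENV", "CONDA_PREFIX"):
        value = environ.get(name)
        if value:  # skip unset and empty values
            return value
    return None
```

With a conda environment activated and VIRTUAL_ENV empty, as in the output above, `find_env_prefix()` would fall through to CONDA_PREFIX.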

xhochy commented 7 years ago

See: https://github.com/blue-yonder/turbodbc/pull/105

I can have a look later this week at how to develop with conda. Currently the conda-forge packages of turbodbc are based on the Python sdist and are not directly produced from the git sources.

MathMagique commented 7 years ago

That variable was what I was looking for.

MathMagique commented 7 years ago

Pull request #105 was merged. So you could just update to the latest master and retry the build without exporting any VIRTUAL_ENV variable.

MathMagique commented 7 years ago

@aschmu Can you confirm that this resolves your issue?

aschmu commented 7 years ago

@MathMagique So, I tried a fresh rebuild from the latest master and while numpy headers were found, I noticed there was a similar issue with arrow. I didn't have too much time to investigate and add a conda prefix here : arrow.cmake as I'm currently out of the office. I'll have a look at it again.

However something that confused me is that there was another issue where the user rebuilt arrow from scratch prior to building turbodbc. Is that necessary ?

MathMagique commented 7 years ago

Ok, maybe @xhochy could have another look at the arrow script?

Building arrow from scratch should not be necessary. I don't do it on my local machine, and continuous builds on Linux and OSX don't do it either. pip install pyarrow should be sufficient for local development.

xhochy commented 7 years ago

I've added CONDA_PREFIX to FindArrow.cmake and also added documentation on how to develop using conda: https://github.com/blue-yonder/turbodbc/pull/106

Verified locally that this works on macOS.

yaxxie commented 7 years ago

@MathMagique this seems to be the same problem as I had, except that @aschmu is using F24 rather than F25.

MathMagique commented 7 years ago

Just so that I don't get confused:

@yaxxie, @aschmu: Could one of you guys please check that, with the latest additions to master, the cmake build works without extra effort on your side on your Fedora machines?

aschmu commented 7 years ago

Hi @MathMagique, @xhochy thanks for putting in the effort to help us build the library. Cmake build now works as is with a conda environment on Fedora 24 :+1: .

But make failed, though I suspect it's not necessarily a turbodbc issue.

/usr/bin/ld: warning: libboost_chrono.so.1.64.0, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libboost_thread.so.1.64.0, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libboost_system.so.1.64.0, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicudata.so.58, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicui18n.so.58, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicuuc.so.58, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)

However the libraries seem to be present (1st one is a symlink)

[uxxxx]$ sudo find / -name "libboost_locale.so.1.64.0" 
/home/uxxxx/.conda/envs/py35/lib/libboost_locale.so.1.64.0 
/home/uxxxx/.conda/envs/.pkgs/boost-cpp-1.64.0-1/lib/libboost_locale.so.1.64.0

The other make errors seem to be a result of this linkage difficulty.
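
To see exactly which shared objects the linker could not resolve, the warnings can be parsed from the make output. This is a rough sketch, assuming the log uses the GNU ld message format shown above; the function name `missing_libraries` is made up for illustration.

```python
import re

# Matches GNU ld warnings of the form:
#   /usr/bin/ld: warning: <lib>, needed by <path>, not found (...)
_LD_WARNING = re.compile(
    r"ld: warning: (?P<lib>[^,\s]+), needed by (?P<needed_by>[^,]+), not found"
)


def missing_libraries(log_text):
    """Return the sorted, de-duplicated library names that ld reported as not found."""
    return sorted({match.group("lib") for match in _LD_WARNING.finditer(log_text)})
```

Note that the warnings name the dependencies of libboost_locale.so (chrono, thread, system and the ICU libraries), so those are the files to look for in the environment's lib directory, rather than libboost_locale.so.1.64.0 itself.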

MathMagique commented 7 years ago

I'll try to setup a local Fedora VM in the near future to do some debugging. Thanks for reporting back!

yaxxie commented 7 years ago

@MathMagique I was never building with conda, only pip, so the patch in #106 cannot help here. I tried looking at the cmake files myself and the only thing I really understand is that it fails at building arrow tests with link errors, but I cannot understand why. I did also try fudging the cmake script and to be honest it looks like all libs and headers are picked up, and it looks more like an issue with the linker command for arrow test, but I couldn't figure out what.

MathMagique commented 7 years ago

Thanks for the clarification. I suspected I may have become confused ;-)

aschmu commented 7 years ago

@MathMagique Installing from pip actually worked (though I think gcc 4.8.x was needed for me), but trying to import connect from turbodbc produced the following error: "GLIBCXX_3.4.20 not found". After a little Stack Overflow research today, a fix that works for me is symlinking /home/uxxxx/.conda/envs/py35/lib/libstdc++.so.6 to the system shared object; in my case it's located at /usr/lib64/libstdc++.so.6.

As for building from scratch in a conda env, I haven't retried yet; make still fails (cmake succeeds, though).
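
Before swapping libraries, it can help to check which copy of libstdc++.so.6 actually carries the required version tag (usually spelled GLIBCXX_3.4.20). The sketch below is a heuristic that scans the raw bytes of the file for version strings, similar in spirit to running strings on it; it does not parse the ELF version tables, and the function names are hypothetical.

```python
import re

# Version tags look like GLIBCXX_3.4 or GLIBCXX_3.4.20 inside the binary.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(data):
    """Return the GLIBCXX version tags found in raw library bytes, in numeric order."""
    versions = {match.group(1).decode("ascii") for match in _GLIBCXX.finditer(data)}
    return sorted(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


def provides(path, wanted="3.4.20"):
    """Check whether the library file at `path` mentions the wanted GLIBCXX tag."""
    with open(path, "rb") as handle:
        return wanted in glibcxx_versions(handle.read())
```

Calling `provides` on both the conda copy and the system copy would show which of the two lacks the tag.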

yaxxie commented 7 years ago

@MathMagique just to be clear, in a few words:

I tried to cmake build with only a virtualenv and pip install pyarrow

This fails on F25 with the link error on arrow TEST module

Having an installed copy of arrow and exporting PKG_CONFIG_PATH=bla before building works successfully (this finds the package properly at https://github.com/blue-yonder/turbodbc/blob/master/cmake_scripts/FindArrow.cmake#L37)
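
For scripted builds, the PKG_CONFIG_PATH workaround can be expressed as a small helper that prepares the environment for the cmake call. This is only a sketch: the helper name is made up, and the directory passed in is a placeholder for wherever the installed copy of arrow keeps its .pc files.

```python
import os


def with_pkg_config_path(directory, environ=None):
    """Return a copy of the environment with `directory` prepended to PKG_CONFIG_PATH."""
    env = dict(os.environ if environ is None else environ)
    existing = env.get("PKG_CONFIG_PATH", "")
    # Prepend so that this directory wins over any system-wide entries.
    env["PKG_CONFIG_PATH"] = (directory + os.pathsep + existing) if existing else directory
    return env
```

The returned mapping could then be handed to `subprocess.run(..., env=env)` when invoking cmake, leaving the calling shell's environment untouched.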