Open aschmu opened 7 years ago
@aschmu The conda packages for Arrow are arrow-cpp and pyarrow (see https://github.com/conda-forge/arrow-cpp-feedstock and https://github.com/conda-forge/pyarrow-feedstock), and the pip package is pyarrow. Sadly, there is a name collision between the Python date package arrow and the Apache Arrow project.
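Because the two projects share a name, it can help to check which of them is actually importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `arrow_packages_present` is hypothetical and not part of turbodbc or Arrow.

```python
import importlib.util


def arrow_packages_present(find_spec=importlib.util.find_spec):
    """Report whether the date library `arrow` and Apache Arrow's
    Python bindings `pyarrow` are importable.

    `find_spec` is injectable so the check can be exercised without
    touching the real environment.
    """
    return {name: find_spec(name) is not None for name in ("arrow", "pyarrow")}


if __name__ == "__main__":
    for name, present in arrow_packages_present().items():
        print(f"{name}: {'found' if present else 'missing'}")
```

If only `arrow` is reported as found, the date library was installed instead of the Apache Arrow bindings.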
Sorry to hear of your build issues. Could you please provide the following outputs:
> which python
> pip freeze
> echo $VIRTUAL_ENV
> echo $PYTHON
> echo $PYTHON_INCLUDE_DIR
>>> import numpy
>>> print(numpy.get_include())
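Most of the requested values can also be gathered in one go from Python. This is only a sketch: the function name `collect_build_diagnostics` is hypothetical, and the `pip freeze` output is left out on purpose because it is easiest to capture from the shell.

```python
import os
import sys


def collect_build_diagnostics(environ=os.environ):
    """Collect the interpreter path, the environment variables asked
    for above, and NumPy's include directory (empty if NumPy is absent)."""
    info = {"python": sys.executable}
    for var in ("VIRTUAL_ENV", "PYTHON", "PYTHON_INCLUDE_DIR"):
        info[var] = environ.get(var, "")
    try:
        import numpy
        info["numpy_include"] = numpy.get_include()
    except ImportError:
        info["numpy_include"] = ""
    return info


if __name__ == "__main__":
    for key, value in collect_build_diagnostics().items():
        print(f"{key}: {value or '[empty]'}")
```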
Hello, so here we go :
> which python
/.conda/envs/py35/bin/python
> pip freeze
cycler==0.10.0
decorator==4.0.11
ipykernel==4.5.2
ipython==5.3.0
ipython-genutils==0.1.0
jupyter-client==5.0.0
jupyter-core==4.3.0
matplotlib==2.0.0
nltk==3.2.2
numpy==1.12.0
pandas==0.19.2
patsy==0.4.1
pexpect==4.2.1
pickleshare==0.7.4
prompt-toolkit==1.0.9
ptyprocess==0.5.1
pyarrow==0.4.1
pybind11==2.1.1
Pygments==2.2.0
pyparsing==2.2.0
python-dateutil==2.6.0
pytz==2016.10
pyzmq==16.0.2
scikit-learn==0.18.1
scipy==0.19.0
seaborn==0.7.1
simplegeneric==0.8.1
six==1.10.0
tornado==4.4.2
traitlets==4.3.2
wcwidth==0.1.7
> echo $VIRTUAL_ENV
[empty]
> echo $PYTHON
[empty]
> echo $PYTHON_INCLUDE_DIR
[empty]
>>> import numpy
>>> print(numpy.get_include())
/home/uxxxx/.conda/envs/py35/lib/python3.5/site-packages/numpy/core/include
Ok, now I realize that you are using conda and not virtualenv to build an environment. That's alright, but the script that detects where NumPy is located does not support that out of the box.
Please try the following: in your conda environment, please export VIRTUAL_ENV=/home/uxxxx/.conda/envs/py35 before calling cmake. This should provide the necessary pointer to find NumPy and let the source build proceed.
@MathMagique CONDA_PREFIX is the equivalent of VIRTUAL_ENV. We should also use that.
See: https://github.com/blue-yonder/turbodbc/pull/105
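The idea of treating CONDA_PREFIX as a fallback for VIRTUAL_ENV can be sketched in a few lines of Python. This illustrates the lookup logic only; it is not the code from the pull request, and both the helper name `environment_prefix` and the lookup order are assumptions.

```python
import os


def environment_prefix(environ=os.environ):
    """Return the root of the active Python environment.

    Assumed order: VIRTUAL_ENV first (virtualenv), then CONDA_PREFIX
    (conda). Returns None when neither variable is set.
    """
    for var in ("VIRTUAL_ENV", "CONDA_PREFIX"):
        value = environ.get(var)
        if value:
            return value
    return None


if __name__ == "__main__":
    print(environment_prefix() or "no active environment detected")
```

With such a fallback, a conda user would not need to export VIRTUAL_ENV by hand before running cmake.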
I can have a look later this week at how to develop with conda. Currently the conda-forge packages of turbodbc are based on the Python sdist and are not produced directly from the git sources.
That variable was what I was looking for
Pull request #105 was merged, so you can just update to the latest master and retry the build without exporting any VIRTUAL_ENV variable.
@aschmu Can you confirm that this resolves your issue?
@MathMagique So, I tried a fresh rebuild from the latest master, and while the numpy headers were found, I noticed a similar issue with arrow. I didn't have much time to investigate and add a conda prefix in arrow.cmake, as I'm currently out of the office. I'll have a look at it again.
However, something that confused me is that there was another issue where the user rebuilt arrow from scratch prior to building turbodbc. Is that necessary?
Ok, maybe @xhochy could have another look at the arrow script?
Building arrow from scratch should not be necessary. I don't do it on my local machine, and the continuous builds on Linux and OSX don't do it either. pip install pyarrow should be sufficient for local development.
I've added CONDA_PREFIX to FindArrow.cmake and also added documentation on how to develop using conda: https://github.com/blue-yonder/turbodbc/pull/106
Verified locally that this works on macOS.
@MathMagique this seems to be the same problem as I had, except that @aschmu is using F24 rather than F25.
Just that I don't get confused:
pip install turbodbc on Fedora 24 with no error message posted yet.
@yaxxie, @aschmu: Could one of you please check that, with the latest additions to master, the cmake build works without extra effort on your side on your Fedora machines?
Hi @MathMagique, @xhochy thanks for putting in the effort to help us build the library. Cmake build now works as is with a conda environment on Fedora 24 :+1: .
But make failed, though I suspect it's not necessarily a turbodbc issue.
/usr/bin/ld: warning: libboost_chrono.so.1.64.0, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libboost_thread.so.1.64.0, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libboost_system.so.1.64.0, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicudata.so.58, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicui18n.so.58, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libicuuc.so.58, needed by /home/uxxxx/.conda/envs/py35/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.5/../../../../lib/libboost_locale.so, not found (try using -rpath or -rpath-link)
However the libraries seem to be present (1st one is a symlink)
[uxxxx]$ sudo find / -name "libboost_locale.so.1.64.0"
/home/uxxxx/.conda/envs/py35/lib/libboost_locale.so.1.64.0
/home/uxxxx/.conda/envs/.pkgs/boost-cpp-1.64.0-1/lib/libboost_locale.so.1.64.0
The other make errors seem to be a result of this linking problem.
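To see at a glance which libraries the linker could not resolve, and whether they exist in the environment's lib directory, the warnings can be parsed with a short script. This is a sketch under assumptions: the helper names are hypothetical, and the regular expression only covers the warning format shown above.

```python
import os
import re

# Matches lines such as:
#   /usr/bin/ld: warning: libfoo.so.1, needed by /path/libbar.so, not found ...
WARNING = re.compile(r"warning: ([^,\s]+), needed by ([^,\s]+), not found")


def missing_libraries(ld_output):
    """Return the library names the linker reported as not found, in order."""
    return [match.group(1) for match in WARNING.finditer(ld_output)]


def locate(names, lib_dir):
    """Map each library name to whether a file of that name exists in lib_dir."""
    return {name: os.path.exists(os.path.join(lib_dir, name)) for name in names}
```

If `locate` reports the files as present in the conda lib directory, the libraries are installed but not on the linker's search path, which matches the `-rpath-link` hint printed by ld.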
I'll try to setup a local Fedora VM in the near future to do some debugging. Thanks for reporting back!
@MathMagique I was never building with conda, only pip, so the patch in #106 cannot help here. I tried looking at the cmake files myself and the only thing I really understand is that it fails at building arrow tests with link errors, but I cannot understand why. I did also try fudging the cmake script and to be honest it looks like all libs and headers are picked up, and it looks more like an issue with the linker command for arrow test, but I couldn't figure out what.
Thanks for the clarification. I suspected I may have become confused ;-)
@MathMagique Installing from pip actually worked (though I think gcc 4.8.x was needed for me), but trying to import connect from turbodbc produced the following error: "GLIBCXX_3.4.20 not found".
After a little SO research today, a fix that works for me is symlinking /home/uxxxx/.conda/envs/py35/lib/libstdc++.so.6 to the system shared object, which in my case is located at /usr/lib64/libstdc++.so.6.
As for building from scratch in a conda env, I haven't retried yet; make still fails (cmake succeeds, though).
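Before replacing a libstdc++ with a symlink, it may be worth comparing which GLIBCXX symbol versions each copy provides. The sketch below scans the binary for version strings, roughly like running `strings` on it and filtering for GLIBCXX; the function name `glibcxx_versions` is hypothetical.

```python
import re


def glibcxx_versions(path):
    """Return the GLIBCXX_x.y.z version strings embedded in a shared
    library, sorted numerically."""
    with open(path, "rb") as handle:
        data = handle.read()
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data))
    return sorted(
        (version.decode() for version in found),
        key=lambda v: tuple(int(part) for part in v.split(".")),
    )


if __name__ == "__main__":
    import sys

    for lib in sys.argv[1:]:
        print(lib, glibcxx_versions(lib))
```

Passing both the conda environment's libstdc++.so.6 and the system one as arguments shows whether 3.4.20 is present in one copy and absent in the other.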
@MathMagique just to be clear in few words
I tried to cmake build with only a virtualenv and pip install pyarrow
This fails on F25 with the link error on arrow TEST module
Having an installed copy of arrow and exporting PKG_CONFIG_PATH=bla before building works successfully (this finds the package properly at https://github.com/blue-yonder/turbodbc/blob/master/cmake_scripts/FindArrow.cmake#L37)
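Whether pkg-config can see an installed Arrow can be checked before running cmake. This is a rough sketch: the helper names are hypothetical, the directory passed in is whatever holds the arrow .pc file on your system, and the module name `arrow` is taken from the "no package 'arrow' found" message.

```python
import os
import shutil
import subprocess


def with_pkg_config_dir(directory, environ=os.environ):
    """Return a copy of the environment with `directory` prepended
    to PKG_CONFIG_PATH."""
    env = dict(environ)
    existing = env.get("PKG_CONFIG_PATH", "")
    env["PKG_CONFIG_PATH"] = directory + (os.pathsep + existing if existing else "")
    return env


def arrow_visible(env):
    """Return True if `pkg-config --exists arrow` succeeds, False if it
    fails, and None if pkg-config is not installed."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(["pkg-config", "--exists", "arrow"], env=env)
    return result.returncode == 0
```

A result of False suggests that cmake will not find the package through pkg-config either.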
Hi,
I'm trying to build turbodbc for Fedora 24 from scratch, as pip installing didn't cut it (I'm getting undefined symbols, maybe due to my system having gcc 6.2.1; I'm switching to gcc 4.8.x). However, during the build process I'm getting a "no package 'arrow' found" error. I did pip install arrow in the conda environment I created for the build, but to no avail. Moreover, cmake is complaining about an unset variable, Numpy_INCLUDE_DIR, though numpy is also installed in the virtualenv. I followed the instructions on this page, and maybe I'm missing something, as I'm not an expert in building packages from scratch.