conda-forge / pyarrow-feedstock

A conda-smithy repository for pyarrow.
BSD 3-Clause "New" or "Revised" License

Updating pyarrow from conda-forge causing errors on Mac py27 #20

Closed data-steve closed 7 years ago

data-steve commented 7 years ago
$ conda install pyarrow -c conda-forge
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /Users/steve/anaconda:

The following packages will be SUPERCEDED by a higher-priority channel:

    conda: 4.3.14-py27_0 --> 4.2.13-py27_0 conda-forge

Proceed ([y]/n)? y

Then I checked the installation:

$ python
Python 2.7.12 |Anaconda custom (x86_64)| (default, Jul  2 2016, 17:43:17) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import pyarrow.parquet as pa
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/__init__.py", line 28, in <module>
    import pyarrow.config
ImportError: No module named config

Then I checked the package folder for config:

$ ls .../anaconda/lib/python2.7/site-packages/pyarrow

__init__.py     compat.py       filesystem.pyc      jemalloc.pyx        scalar.pyx      table_api.h
__init__.pyc        compat.pyc      formatting.py       libpyarrow.dylib    scalar.so       tests
_parquet.pxd        config.pyx      formatting.pyc      memory.pxd      schema.pxd      util.py
_parquet.pyx        config.so       io.pxd          memory.pyx      schema.pyx      util.pyc
_parquet.so     error.pxd       io.pyx          memory.so       schema.so
array.pxd       error.pyx       io.so           parquet.py      table.pxd
array.pyx       error.so        ipc.py          parquet.pyc     table.pyx
array.so        filesystem.py       ipc.pyc         scalar.pxd      table.so

If I try to import pyarrow directly, I get a different error:

$ python
>>> import pyarrow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/__init__.py", line 20, in <module>
    from pkg_resources import get_distribution, DistributionNotFound
  File "/Users/steve/anaconda/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 21, in <module>
    try:
ImportError: dlopen(./io.so, 2): Symbol not found: _pyarrow_ARRAY_API
  Referenced from: /Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/libpyarrow.dylib
  Expected in: flat namespace
 in /Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/libpyarrow.dylib
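
One possible reading of the second traceback, offered as an assumption rather than something confirmed in this thread: the loader path `./io.so` is relative, which would fit Python having been started from inside the pyarrow package folder, so that pyarrow's `io.so` was found ahead of the standard-library `io` module. A minimal sketch for checking a directory for such name clashes follows; `find_shadowing` and `MODULE_SUFFIXES` are hypothetical names introduced only for illustration.

```python
import os

# File suffixes that Python may treat as importable modules (illustrative subset).
MODULE_SUFFIXES = (".py", ".pyc", ".so", ".pyd")


def find_shadowing(directory, module_names):
    """Return the names from module_names that an entry in directory could shadow."""
    wanted = set(module_names)
    found = set()
    for entry in os.listdir(directory):
        # "io.so" and "io.cpython-311-darwin.so" both reduce to the stem "io".
        stem = entry.split(".")[0]
        if stem not in wanted:
            continue
        full = os.path.join(directory, entry)
        if os.path.isdir(full) or entry.endswith(MODULE_SUFFIXES):
            found.add(stem)
    return sorted(found)


if __name__ == "__main__":
    # Check the current working directory against a few module names.
    print(find_shadowing(os.getcwd(), ["io", "config", "json"]))
```

If the current directory is reported as shadowing `io`, starting Python from a different directory is a cheap way to test this hypothesis.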

data-steve commented 7 years ago

In case conda info helps:

Current conda install:

               platform : osx-64
          conda version : 4.2.13
       conda is private : False
      conda-env version : 4.2.13
    conda-build version : 2.0.2
         python version : 2.7.12.final.0
       requests version : 2.13.0
       root environment : /Users/steve/anaconda  (writable)
    default environment : /Users/steve/anaconda
       envs directories : /Users/steve/anaconda/envs
          package cache : /Users/steve/anaconda/pkgs
           channel URLs : https://repo.continuum.io/pkgs/free/osx-64
                          https://repo.continuum.io/pkgs/free/noarch
                          https://repo.continuum.io/pkgs/pro/osx-64
                          https://repo.continuum.io/pkgs/pro/noarch
            config file : None
           offline mode : False

data-steve commented 7 years ago

I also tried to build from source, but I could never figure out from the GitHub README how to get the parquet-cpp files to download, build, and install:

Selected compiler clang 
Using dynamic linking for DEBUG builds
Using ld linker
-- Could not find the Parquet library. Looked in  system search paths.
-- Found the Arrow core library: /Users/steve/local/lib/libarrow.dylib
-- Found the Arrow IO library: /Users/steve/local/lib/libarrow_io.dylib
-- Found the Arrow IPC library: /Users/steve/local/lib/libarrow_ipc.dylib
-- Found the Arrow jemalloc library: /Users/steve/local/lib/libarrow_jemalloc.dylib
-- Added shared library dependency arrow: /Users/steve/local/lib/libarrow.dylib
-- Added shared library dependency arrow_io: /Users/steve/local/lib/libarrow_io.dylib
-- Added shared library dependency arrow_ipc: /Users/steve/local/lib/libarrow_ipc.dylib
CMake Error at CMakeLists.txt:464 (message):
  Unable to locate Parquet libraries

data-steve commented 7 years ago

I had to do a fresh install of Anaconda.

But as I remember, the error was introduced after updating pyarrow, which was followed by this:

The following packages will be SUPERCEDED by a higher-priority channel:

    conda:       4.3.14-py27_0                    --> 4.2.13-py27_0 conda-forge
    conda-env:   2.6.0-0                          --> 2.6.0-0       conda-forge

Proceed ([y]/n)? y

Specifically, the line conda-env: 2.6.0-0 --> 2.6.0-0 conda-forge.
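
The package plan above moves conda itself from 4.3.14 to 4.2.13 because the conda-forge channel takes priority. As a rough sketch, and assuming plan lines keep the `name: old --> new channel` layout shown in this thread, such downgrades can be flagged before answering the prompt. The helper names `version_key` and `find_downgrades` are hypothetical, and the version comparison is deliberately naive (numeric dotted parts only, build string ignored).

```python
import re

# Matches lines such as "conda: 4.3.14-py27_0 --> 4.2.13-py27_0 conda-forge".
PLAN_LINE = re.compile(
    r"^\s*(?P<name>[\w.-]+):\s+(?P<old>\S+)\s+-->\s+(?P<new>\S+)"
    r"(?:\s+(?P<channel>\S+))?\s*$"
)


def version_key(spec):
    """Turn '4.3.14-py27_0' into (4, 3, 14); the build string is ignored."""
    version = spec.split("-")[0]
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def find_downgrades(plan_text):
    """Return (name, old, new, channel) for each plan line whose version decreases."""
    downgrades = []
    for line in plan_text.splitlines():
        match = PLAN_LINE.match(line)
        if match and version_key(match["new"]) < version_key(match["old"]):
            downgrades.append(
                (match["name"], match["old"], match["new"], match["channel"])
            )
    return downgrades
```

Applied to the plan quoted above, this flags the conda line but not the conda-env line, whose version is unchanged and only switches channel.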

xhochy commented 7 years ago

To build parquet-cpp correctly from source so that you can use it in pyarrow, you need to specify -DPARQUET_ARROW=ON in the cmake call of parquet-cpp.
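
For illustration only, here is a small sketch that assembles such a cmake call as an argument list. The `-DPARQUET_ARROW=ON` flag comes from the comment above and `CMAKE_INSTALL_PREFIX` is a standard CMake variable; the function name, the source path, and the choice of install prefix are assumptions, and parquet-cpp may need further options that are not covered here.

```python
import shlex


def parquet_cpp_cmake_command(source_dir, install_prefix=None):
    """Build the argument list for configuring parquet-cpp with Arrow support."""
    command = ["cmake", "-DPARQUET_ARROW=ON"]
    if install_prefix is not None:
        # Optional: install next to an existing Arrow build (assumed layout).
        command.append("-DCMAKE_INSTALL_PREFIX=" + install_prefix)
    command.append(source_dir)
    return command


if __name__ == "__main__":
    # "path/to/parquet-cpp" is a placeholder for the real source checkout.
    args = parquet_cpp_cmake_command("path/to/parquet-cpp", "/Users/steve/local")
    print(" ".join(shlex.quote(arg) for arg in args))
```

The list form can be passed directly to `subprocess.run` without shell quoting concerns.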

To debug this problem a bit better, can you post the output of otool -L .../pyarrow/_parquet.so and otool -L .../pyarrow/config.so?

data-steve commented 7 years ago

Sure, thanks for your responsiveness

But I ended up doing a complete re-install of anaconda, so everything’s probably changed at this point.

steve$ otool -L /Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/_parquet.so
/Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/_parquet.so:
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
    @rpath/libpyarrow.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libarrow.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libarrow_io.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libarrow_ipc.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libparquet_arrow.dylib (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)

steve$ otool -L /Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/config.so
/Users/steve/anaconda/lib/python2.7/site-packages/pyarrow/config.so:
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
    @rpath/libpyarrow.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libarrow.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libarrow_io.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libarrow_ipc.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libparquet_arrow.dylib (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)
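
When comparing this kind of output across several extension modules, it can help to parse it rather than read it by eye. The sketch below assumes the usual `otool -L` layout (a header line naming the file, then one indented library per line) and library paths without spaces; `parse_otool_libs` and `rpath_libs` are hypothetical helper names.

```python
import re

# One dependency line: "<path> (compatibility version X, current version Y)".
DEP_LINE = re.compile(
    r"^\s+(?P<path>\S+) \(compatibility version (?P<compat>[\d.]+), "
    r"current version (?P<current>[\d.]+)\)\s*$"
)


def parse_otool_libs(output):
    """Return a list of (path, compatibility_version, current_version) tuples."""
    libs = []
    for line in output.splitlines():
        match = DEP_LINE.match(line)
        if match:  # the unindented header line naming the file is skipped
            libs.append((match["path"], match["compat"], match["current"]))
    return libs


def rpath_libs(libs):
    """Keep only the library paths that the loader resolves through @rpath."""
    return [path for path, _, _ in libs if path.startswith("@rpath/")]
```

The `@rpath` entries are the ones that depend on the environment's library directory, so they are the first place to look when a library cannot be loaded.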

~ Steve

On Mar 22, 2017, at 3:59 AM, Uwe L. Korn notifications@github.com wrote:

To build parquet-cpp correctly from source so that you can use it in pyarrow, you need to specify -DPARQUET_ARROW=ON in the cmake call of parquet-cpp.

To debug this problem a bit better, can you post the output of otool -L .../pyarrow/_parquet.so and otool .../pyarrow/config.so?

data-steve commented 7 years ago

Also, I thought I did this step:

-DPARQUET_ARROW=ON

But honestly, since I'm not a C dev, the install instructions left me confused at times. Again, that's down to my lack of experience.

~ Steve

xhochy commented 7 years ago

The otool output seems fine. After the fresh Anaconda install and a fresh creation of the env where you installed pyarrow, does the error still persist?

data-steve commented 7 years ago

Everything is fine now. Thanks!

~ Steve

On Mar 26, 2017, at 4:44 AM, Uwe L. Korn notifications@github.com wrote:

The otool output seems fine. After the fresh anaconda install and a fresh creation of the env where you have installed pyarrow, the error still persists?
