rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] ModuleNotFoundError: No module named 'pyarrow._orc' #5808

Closed pbrady32 closed 4 years ago

pbrady32 commented 4 years ago

Describe the bug

I'm using an HPC cluster at work (CentOS 7.7) that is managed by the SLURM workload manager.

When attempting to import CUDF, I receive the following error:

(cudftest) [pgbrady@gl1004 gpu]$ python
Python 3.6.11 | packaged by conda-forge | (default, Jul 28 2020, 23:15:00)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cudf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pgbrady/.conda/envs/cudftest/lib/python3.6/site-packages/cudf/__init__.py", line 35, in <module>
    from cudf.io import (
  File "/home/pgbrady/.conda/envs/cudftest/lib/python3.6/site-packages/cudf/io/__init__.py", line 9, in <module>
    from cudf.io.orc import read_orc, read_orc_metadata, to_orc
  File "/home/pgbrady/.conda/envs/cudftest/lib/python3.6/site-packages/cudf/io/orc.py", line 6, in <module>
    import pyarrow.orc as orc
  File "/home/pgbrady/.local/lib/python3.6/site-packages/pyarrow/orc.py", line 25, in <module>
    import pyarrow._orc as _orc
ModuleNotFoundError: No module named 'pyarrow._orc'

Steps/Code to reproduce bug

I started with a fresh conda env and ran the following command. No other commands were run to set up the environment:

conda install -c rapidsai -c nvidia -c numba -c conda-forge cudf=0.14 python=3.6 cudatoolkit=10.1

Expected behavior

cudf should import without error.

Environment overview (please complete the following information)

Environment details


     ***git***
     Not inside a git repository

     ***OS Information***
     #
     # This file is managed by Ansible.
     #
     # template: /etc/ansible/roles/arcts-release/templates/arcts-release.j2
     #
     arcts-release=2.0
     CentOS Linux release 7.7.1908 (Core)
     NAME="CentOS Linux"
     VERSION="7 (Core)"
     ID="centos"
     ID_LIKE="rhel fedora"
     VERSION_ID="7"
     PRETTY_NAME="CentOS Linux 7 (Core)"
     ANSI_COLOR="0;31"
     CPE_NAME="cpe:/o:centos:centos:7"
     HOME_URL="https://www.centos.org/"
     BUG_REPORT_URL="https://bugs.centos.org/"

     CENTOS_MANTISBT_PROJECT="CentOS-7"
     CENTOS_MANTISBT_PROJECT_VERSION="7"
     REDHAT_SUPPORT_PRODUCT="centos"
     REDHAT_SUPPORT_PRODUCT_VERSION="7"

     CentOS Linux release 7.7.1908 (Core)
     CentOS Linux release 7.7.1908 (Core)
     Linux gl1004.arc-ts.umich.edu 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

     ***GPU Information***
     Thu Jul 30 08:26:57 2020
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |===============================+======================+======================|
     |   0  Tesla V100-PCIE...  On   | 00000000:3B:00.0 Off |                    0 |
     | N/A   33C    P0    25W / 250W |      0MiB / 16130MiB |      0%   E. Process |
     +-------------------------------+----------------------+----------------------+

     +-----------------------------------------------------------------------------+
     | Processes:                                                       GPU Memory |
     |  GPU       PID   Type   Process name                             Usage      |
     |=============================================================================|
     |  No running processes found                                                 |
     +-----------------------------------------------------------------------------+

     ***CPU***
     Architecture:          x86_64
     CPU op-mode(s):        32-bit, 64-bit
     Byte Order:            Little Endian
     CPU(s):                40
     On-line CPU(s) list:   0-39
     Thread(s) per core:    1
     Core(s) per socket:    20
     Socket(s):             2
     NUMA node(s):          4
     Vendor ID:             GenuineIntel
     CPU family:            6
     Model:                 85
     Model name:            Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
     Stepping:              4
     CPU MHz:               3099.902
     CPU max MHz:           3700.0000
     CPU min MHz:           1000.0000
     BogoMIPS:              4800.00
     Virtualization:        VT-x
     L1d cache:             32K
     L1i cache:             32K
     L2 cache:              1024K
     L3 cache:              28160K
     NUMA node0 CPU(s):     0,4,8,12,16,20,24,28,32,36
     NUMA node1 CPU(s):     1,5,9,13,17,21,25,29,33,37
     NUMA node2 CPU(s):     2,6,10,14,18,22,26,30,34,38
     NUMA node3 CPU(s):     3,7,11,15,19,23,27,31,35,39
     Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear spec_ctrl intel_stibp flush_l1d

     ***CMake***
     /usr/bin/cmake
     cmake version 2.8.12.2

     ***g++***
     /sw/arcts/centos7/gcc/9.2.0/bin/g++
     g++ (GCC) 9.2.0
     Copyright (C) 2019 Free Software Foundation, Inc.
     This is free software; see the source for copying conditions.  There is NO
     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

     ***nvcc***
     /sw/arcts/centos7/cuda/10.1.243/bin/nvcc
     nvcc: NVIDIA (R) Cuda compiler driver
     Copyright (c) 2005-2019 NVIDIA Corporation
     Built on Sun_Jul_28_19:07:16_PDT_2019
     Cuda compilation tools, release 10.1, V10.1.243

     ***Python***
     /home/pgbrady/.conda/envs/cudftest/bin/python
     Python 3.6.11

     ***Environment Variables***
     PATH                            : /sw/arcts/centos7/gcc/9.2.0/bin:/sw/arcts/centos7/cuda/10.1.243/bin:/home/pgbrady/.conda/envs/cudftest/bin:/sw/arcts/centos7/python3.7-anaconda/2019.07/condabin:/opt/TurboVNC/bin:/opt/slurm/bin:/opt/slurm/sbin:/usr/lib64/qt-3.3/bin:/sw/arcts/centos7/usertools/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lpp/mmfs/bin:/opt/ibutils/bin:/opt/ddn/ime/bin:/home/pgbrady/.local/bin:/home/pgbrady/bin
     LD_LIBRARY_PATH                 : /sw/arcts/centos7/gcc/9.2.0/lib64:/sw/arcts/centos7/cuda/10.1.243/lib64:/opt/slurm/lib64::
     NUMBAPRO_NVVM                   :
     NUMBAPRO_LIBDEVICE              :
     CONDA_PREFIX                    : /home/pgbrady/.conda/envs/cudftest
     PYTHON_PATH                     :

     ***conda packages***
     /sw/arcts/centos7/python3.7-anaconda/2019.07/condabin/conda
     # packages in environment at /home/pgbrady/.conda/envs/cudftest:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                 conda_forge    conda-forge
     _openmp_mutex             4.5                       0_gnu    conda-forge
     arrow-cpp                 0.15.0           py36h090bef1_2    conda-forge
     boost-cpp                 1.70.0               h7b93d67_3    conda-forge
     brotli                    1.0.7             he1b5a44_1004    conda-forge
     bzip2                     1.0.8                h516909a_2    conda-forge
     c-ares                    1.16.1               h516909a_0    conda-forge
     ca-certificates           2020.6.20            hecda079_0    conda-forge
     certifi                   2020.6.20        py36h9f0ad1d_0    conda-forge
     cudatoolkit               10.1.243             h6bb024c_0    nvidia
     cudf                      0.14.0                   py36_0    rapidsai
     cudnn                     7.6.0                cuda10.1_0    nvidia
     cupy                      7.6.0            py36h5c369b2_0    conda-forge
     dlpack                    0.3                  he1b5a44_1    conda-forge
     double-conversion         3.1.5                he1b5a44_2    conda-forge
     fastavro                  0.23.6           py36h8c4c3a4_0    conda-forge
     fastrlock                 0.5              py36h831f99a_0    conda-forge
     fsspec                    0.7.4                      py_0    conda-forge
     gflags                    2.2.2             he1b5a44_1004    conda-forge
     glog                      0.4.0                h49b9bf7_3    conda-forge
     grpc-cpp                  1.23.0               h18db393_0    conda-forge
     icu                       67.1                 he1b5a44_0    conda-forge
     ld_impl_linux-64          2.34                 hc38a660_9    conda-forge
     libblas                   3.8.0               17_openblas    conda-forge
     libcblas                  3.8.0               17_openblas    conda-forge
     libcudf                   0.14.0               cuda10.1_0    rapidsai
     libevent                  2.1.10               hcdb4288_1    conda-forge
     libffi                    3.2.1             he1b5a44_1007    conda-forge
     libgcc-ng                 9.3.0               h24d8f2e_11    conda-forge
     libgfortran-ng            7.5.0               hdf63c60_11    conda-forge
     libgomp                   9.3.0               h24d8f2e_11    conda-forge
     liblapack                 3.8.0               17_openblas    conda-forge
     libnvstrings              0.14.0               cuda10.1_0    rapidsai
     libopenblas               0.3.10          pthreads_hb3c22a3_4    conda-forge
     libprotobuf               3.8.0                h8b12597_0    conda-forge
     librmm                    0.14.0               cuda10.1_0    rapidsai
     libstdcxx-ng              9.3.0               hdf63c60_11    conda-forge
     llvmlite                  0.33.0           py36hf484d3e_0    numba
     lz4-c                     1.8.3             he1b5a44_1001    conda-forge
     nccl                      2.4.6.1              cuda10.1_0    nvidia
     ncurses                   6.2                  he1b5a44_1    conda-forge
     numba                     0.50.1          np1.11py3.6h04863e7_gbfd9be18f_0         numba
     numpy                     1.19.1           py36h7314795_0    conda-forge
     nvstrings                 0.14.0                   py36_0    rapidsai
     openssl                   1.1.1g               h516909a_1    conda-forge
     pandas                    0.25.3           py36hb3f55d8_0    conda-forge
     parquet-cpp               1.5.1                         2    conda-forge
     pip                       20.2                       py_0    conda-forge
     pyarrow                   0.15.0           py36h8b68381_1    conda-forge
     python                    3.6.11          h425cb1d_1_cpython    conda-forge
     python-dateutil           2.8.1                      py_0    conda-forge
     python_abi                3.6                     1_cp36m    conda-forge
     pytz                      2020.1             pyh9f0ad1d_0    conda-forge
     re2                       2020.04.01           he1b5a44_0    conda-forge
     readline                  8.0                  he28a2e2_2    conda-forge
     rmm                       0.14.0                   py36_0    rapidsai
     setuptools                49.2.0           py36h9f0ad1d_0    conda-forge
     six                       1.15.0             pyh9f0ad1d_0    conda-forge
     snappy                    1.1.8                he1b5a44_3    conda-forge
     spdlog                    1.7.0                hc9558a2_0    conda-forge
     sqlite                    3.32.3               hcee41ef_1    conda-forge
     thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
     tk                        8.6.10               hed695b0_0    conda-forge
     uriparser                 0.9.3                he1b5a44_1    conda-forge
     wheel                     0.34.2                     py_1    conda-forge
     xz                        5.2.5                h516909a_1    conda-forge
     zlib                      1.2.11            h516909a_1006    conda-forge
     zstd                      1.4.4                h3b9ef0a_2    conda-forge

Additional context

When I run the python command, it seems odd that [GCC 7.5.0] on linux is shown when the Python prompt starts.

kkraus14 commented 4 years ago

@pbrady32 it looks like you have another installed pyarrow so it's not using the one in your conda environment: /home/pgbrady/.local/lib/python3.6/site-packages/pyarrow/

I'd expect to see /home/pgbrady/.conda/envs/cudftest/lib/python3.6/site-packages/pyarrow/ instead if it was using the one from your conda environment.
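One way to check this without triggering the failing import is to ask the import system where it would load a package from. The following is only a minimal sketch: the helper names `module_origin`, `is_inside`, and `report` are made up for illustration and are not part of cudf or pyarrow. It relies on the standard-library `importlib.util.find_spec`, `site.getusersitepackages`, and `sys.prefix`.

```python
import importlib.util
import os
import site
import sys


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None.

    find_spec() locates a top-level module without importing it.
    Namespace packages also report None here because they have no origin.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_inside(path, prefix):
    """Return True if `path` lies under the directory `prefix`."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


def report(name):
    """Describe whether `name` resolves inside the active environment."""
    origin = module_origin(name)
    if origin is None:
        return "{}: no origin found".format(name)
    where = "inside" if is_inside(origin, sys.prefix) else "OUTSIDE"
    return "{}: {} ({} {})".format(name, origin, where, sys.prefix)


if __name__ == "__main__":
    # The per-user directory that can shadow packages in a conda env.
    print("user site-packages:", site.getusersitepackages())
    print(report("pyarrow"))
```

Run with the conda environment's interpreter. If the reported path sits under the user site-packages directory rather than under `sys.prefix`, the copy in the conda environment is being shadowed.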

Maybe try purging it with pip uninstall pyarrow and then using conda to reinstall it?
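Before uninstalling anything, it may help to confirm that the per-user site-packages directory really is the cause. This was not suggested in the thread; it is a hedged sketch that starts a child interpreter with the standard `-s` flag (equivalent to setting `PYTHONNOUSERSITE`), which keeps the user site directory off `sys.path`. The helper name `can_import` is hypothetical.

```python
import subprocess
import sys


def can_import(module, use_user_site=True):
    """Return True if `import module` succeeds in a fresh interpreter.

    With use_user_site=False the child is started with `-s`, so packages
    under the per-user site-packages directory are not visible to it.
    """
    args = [sys.executable]
    if not use_user_site:
        args.append("-s")
    args += ["-c", "import " + module]
    result = subprocess.run(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


if __name__ == "__main__":
    for name in ("pyarrow.orc", "cudf"):
        print(name, "with user site:", can_import(name))
        print(name, "without user site:", can_import(name, use_user_site=False))
```

If an import fails with the user site enabled but succeeds without it, a package in the user site directory is the likely culprit, and removing it as suggested above should fix the environment.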

kkraus14 commented 4 years ago

Noted in community slack that this is resolved by my message above.