rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] python build of cudf fails with error compiling Cython file "not allowed without gil" #4376

Closed saltylamon closed 4 years ago

saltylamon commented 4 years ago

[-- ENV --]
Debian 10
Python 3.8
CUDA 10
Cython 0.29.15
scipy 1.4.1
cmake 3.13.4

/cudf/python/cudf# python3 setup.py build_ext --inplace

Compiling cudf/_lib/arrow/_cuda.pyx because it changed. Compiling cudf/_lib/avro.pyx because it changed. Compiling cudf/_lib/binops.pyx because it changed. Compiling cudf/_lib/concat.pyx because it changed. Compiling cudf/_lib/copying.pyx because it changed. Compiling cudf/_lib/csv.pyx because it changed. Compiling cudf/_lib/cudf.pyx because it changed. Compiling cudf/_lib/dlpack.pyx because it changed. Compiling cudf/_lib/filling.pyx because it changed. Compiling cudf/_lib/gpuarrow.pyx because it changed. Compiling cudf/_lib/groupby.pyx because it changed. Compiling cudf/_lib/hash.pyx because it changed. Compiling cudf/_lib/issorted.pyx because it changed. Compiling cudf/_lib/join.pyx because it changed. Compiling cudf/_lib/nvtx.pyx because it changed. Compiling cudf/_lib/orc.pyx because it changed. Compiling cudf/_lib/quantile.pyx because it changed. Compiling cudf/_lib/reduce.pyx because it changed. Compiling cudf/_lib/replace.pyx because it changed. Compiling cudf/_lib/reshape.pyx because it changed. Compiling cudf/_lib/rolling.pyx because it changed. Compiling cudf/_lib/search.pyx because it changed. Compiling cudf/_lib/sort.pyx because it changed. Compiling cudf/_lib/stream_compaction.pyx because it changed. Compiling cudf/_lib/table.pyx because it changed. Compiling cudf/_lib/transpose.pyx because it changed. Compiling cudf/_lib/typecast.pyx because it changed. Compiling cudf/_lib/unaryops.pyx because it changed. Compiling cudf/_lib/utils.pyx because it changed. Compiling cudf/_libxx/aggregation.pyx because it changed. Compiling cudf/_libxx/arrow/_cuda.pyx because it changed. Compiling cudf/_libxx/avro.pyx because it changed. Compiling cudf/_libxx/column.pyx because it changed. Compiling cudf/_libxx/copying.pyx because it changed. Compiling cudf/_libxx/dlpack.pyx because it changed. Compiling cudf/_libxx/gpuarrow.pyx because it changed. Compiling cudf/_libxx/hash.pyx because it changed. Compiling cudf/_libxx/io/utils.pyx because it changed. 
Compiling cudf/_libxx/join.pyx because it changed. Compiling cudf/_libxx/json.pyx because it changed. Compiling cudf/_libxx/merge.pyx because it changed. Compiling cudf/_libxx/null_mask.pyx because it changed. Compiling cudf/_libxx/orc.pyx because it changed. Compiling cudf/_libxx/parquet.pyx because it changed. Compiling cudf/_libxx/quantiles.pyx because it changed. Compiling cudf/_libxx/reduce.pyx because it changed. Compiling cudf/_libxx/replace.pyx because it changed. Compiling cudf/_libxx/reshape.pyx because it changed. Compiling cudf/_libxx/rolling.pyx because it changed. Compiling cudf/_libxx/scalar.pyx because it changed. Compiling cudf/_libxx/search.pyx because it changed. Compiling cudf/_libxx/sort.pyx because it changed. Compiling cudf/_libxx/stream_compaction.pyx because it changed. Compiling cudf/_libxx/string_casting.pyx because it changed. Compiling cudf/_libxx/strings/char_types.pyx because it changed. Compiling cudf/_libxx/strings/replace.pyx because it changed. Compiling cudf/_libxx/strings/substring.pyx because it changed. Compiling cudf/_libxx/strings/wrap.pyx because it changed. Compiling cudf/_libxx/table.pyx because it changed. Compiling cudf/_libxx/transform.pyx because it changed. Compiling cudf/_libxx/transpose.pyx because it changed. Compiling cudf/_libxx/types.pyx because it changed. Compiling cudf/_libxx/unary.pyx because it changed. [ 1/63] Cythonizing cudf/_lib/arrow/_cuda.pyx

Error compiling Cython file:

...

# flake8: noqa

from __future__ import absolute_import

from pyarrow.lib cimport *
^

cudf/_lib/arrow/_cuda.pxd:22:0: 'pyarrow/lib.pxd' not found

Error compiling Cython file:

...

# flake8: noqa

from __future__ import absolute_import

from pyarrow.lib cimport *
from pyarrow.includes.common cimport *
^

cudf/_lib/arrow/_cuda.pxd:23:0: 'pyarrow/includes/common.pxd' not found

Error compiling Cython file:

...

from __future__ import absolute_import

from pyarrow.lib cimport *
from pyarrow.includes.common cimport *
from pyarrow.includes.libarrow cimport *
^

cudf/_lib/arrow/_cuda.pxd:24:0: 'pyarrow/includes/libarrow.pxd' not found

Error compiling Cython file:

...

cdef extern from "arrow/gpu/cuda_api.h" namespace "arrow::cuda" nogil:

cdef cppclass CCudaDeviceManager" arrow::cuda::CudaDeviceManager":
    @staticmethod
    CStatus GetInstance(CCudaDeviceManager** manager)
   ^

cudf/_lib/arrow/libarrow_cuda.pxd:26:8: 'CStatus' is not a type identifier

Error compiling Cython file:

... cdef extern from "arrow/gpu/cuda_api.h" namespace "arrow::cuda" nogil:

cdef cppclass CCudaDeviceManager" arrow::cuda::CudaDeviceManager":
    @staticmethod
    CStatus GetInstance(CCudaDeviceManager** manager)
    CStatus GetContext(int gpu_number, shared_ptr[CCudaContext]* ctx)
   ^

cudf/_lib/arrow/libarrow_cuda.pxd:27:8: 'CStatus' is not a type identifier

Error compiling Cython file:

... cdef extern from "arrow/gpu/cuda_api.h" namespace "arrow::cuda" nogil:

cdef cppclass CCudaDeviceManager" arrow::cuda::CudaDeviceManager":
    @staticmethod
    CStatus GetInstance(CCudaDeviceManager** manager)
    CStatus GetContext(int gpu_number, shared_ptr[CCudaContext]* ctx)
                                      ^

cudf/_lib/arrow/libarrow_cuda.pxd:27:43: 'shared_ptr' is not a type identifier

Error compiling Cython file:

...

cdef cppclass CCudaDeviceManager" arrow::cuda::CudaDeviceManager":
    @staticmethod
    CStatus GetInstance(CCudaDeviceManager** manager)
    CStatus GetContext(int gpu_number, shared_ptr[CCudaContext]* ctx)
    CStatus GetSharedContext(int gpu_number,
   ^

cudf/_lib/arrow/libarrow_cuda.pxd:28:8: 'CStatus' is not a type identifier

Error compiling Cython file:

...
    @staticmethod
    CStatus GetInstance(CCudaDeviceManager** manager)
    CStatus GetContext(int gpu_number, shared_ptr[CCudaContext]* ctx)
    CStatus GetSharedContext(int gpu_number,
                             void* handle,
                             shared_ptr[CCudaContext]* ctx)
                             ^

cudf/_lib/arrow/libarrow_cuda.pxd:30:33: 'shared_ptr' is not a type identifier

Error compiling Cython file:

...
    CStatus GetInstance(CCudaDeviceManager** manager)
    CStatus GetContext(int gpu_number, shared_ptr[CCudaContext]* ctx)
    CStatus GetSharedContext(int gpu_number,
                             void* handle,
                             shared_ptr[CCudaContext]* ctx)
    CStatus AllocateHost(int device_number, int64_t nbytes,
    ^

cudf/_lib/arrow/libarrow_cuda.pxd:31:8: 'CStatus' is not a type identifier

Error compiling Cython file:

...
    CStatus GetInstance(CCudaDeviceManager** manager)
    CStatus GetContext(int gpu_number, shared_ptr[CCudaContext]* ctx)
    CStatus GetSharedContext(int gpu_number,
                             void* handle,
                             shared_ptr[CCudaContext]* ctx)
    CStatus AllocateHost(int device_number, int64_t nbytes,
                                            ^

cudf/_lib/arrow/libarrow_cuda.pxd:31:48: 'int64_t' is not a type identifier

Error compiling Cython file:

..... ...... it goes on forever with these ....... .....

cudf/_lib/arrow/_cuda.pyx:892:45: Converting to Python object not allowed without gil
Traceback (most recent call last):
  File "setup.py", line 80, in <module>
    ext_modules=cythonize(
  File "/usr/local/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1101, in cythonize
    cythonize_one(*args)
  File "/usr/local/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1224, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: cudf/_lib/arrow/_cuda.pyx

kkraus14 commented 4 years ago

@saltylamon Python 3.8 is not yet supported, but it looks like all of the errors stem from not finding the pyarrow Cython headers: cudf/_lib/arrow/_cuda.pxd:22:0: 'pyarrow/lib.pxd' not found

Do you have pyarrow installed?
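A quick way to answer this question before re-running the build is to check whether the Cython header that the first error complains about (pyarrow/lib.pxd) is actually present in the installed package. The following is a minimal sketch; the helper name find_pyarrow_pxd is made up for illustration, and it only checks for the file, not for version compatibility.

```python
import importlib.util
from pathlib import Path


def find_pyarrow_pxd(package="pyarrow"):
    """Return the path to <package>/lib.pxd, or None if it cannot be found.

    Hypothetical helper: it locates the package without importing it,
    so it also works when importing the package itself would crash.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "lib.pxd"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_pyarrow_pxd() or "pyarrow/lib.pxd not found for this interpreter")
```

Run it with the same interpreter that runs setup.py (for example python3.7), since each interpreter has its own site-packages.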

saltylamon commented 4 years ago

After installing pyarrow and python3.7

error:

/code/cudf/python/cudf# python3.7 setup.py build_ext --inplace Compiling cudf/_lib/arrow/_cuda.pyx because it changed. Compiling cudf/_lib/avro.pyx because it changed. Compiling cudf/_lib/binops.pyx because it changed. Compiling cudf/_lib/concat.pyx because it changed. Compiling cudf/_lib/copying.pyx because it changed. Compiling cudf/_lib/csv.pyx because it changed. Compiling cudf/_lib/cudf.pyx because it changed. Compiling cudf/_lib/dlpack.pyx because it changed. Compiling cudf/_lib/filling.pyx because it changed. Compiling cudf/_lib/gpuarrow.pyx because it changed. Compiling cudf/_lib/groupby.pyx because it changed. Compiling cudf/_lib/hash.pyx because it changed. Compiling cudf/_lib/issorted.pyx because it changed. Compiling cudf/_lib/join.pyx because it changed. Compiling cudf/_lib/nvtx.pyx because it changed. Compiling cudf/_lib/orc.pyx because it changed. Compiling cudf/_lib/quantile.pyx because it changed. Compiling cudf/_lib/reduce.pyx because it changed. Compiling cudf/_lib/replace.pyx because it changed. Compiling cudf/_lib/reshape.pyx because it changed. Compiling cudf/_lib/rolling.pyx because it changed. Compiling cudf/_lib/search.pyx because it changed. Compiling cudf/_lib/sort.pyx because it changed. Compiling cudf/_lib/stream_compaction.pyx because it changed. Compiling cudf/_lib/table.pyx because it changed. Compiling cudf/_lib/transpose.pyx because it changed. Compiling cudf/_lib/typecast.pyx because it changed. Compiling cudf/_lib/unaryops.pyx because it changed. Compiling cudf/_lib/utils.pyx because it changed. Compiling cudf/_libxx/aggregation.pyx because it changed. Compiling cudf/_libxx/arrow/_cuda.pyx because it changed. Compiling cudf/_libxx/avro.pyx because it changed. Compiling cudf/_libxx/column.pyx because it changed. Compiling cudf/_libxx/copying.pyx because it changed. Compiling cudf/_libxx/dlpack.pyx because it changed. Compiling cudf/_libxx/gpuarrow.pyx because it changed. Compiling cudf/_libxx/hash.pyx because it changed. 
Compiling cudf/_libxx/io/utils.pyx because it changed. Compiling cudf/_libxx/join.pyx because it changed. Compiling cudf/_libxx/json.pyx because it changed. Compiling cudf/_libxx/merge.pyx because it changed. Compiling cudf/_libxx/null_mask.pyx because it changed. Compiling cudf/_libxx/orc.pyx because it changed. Compiling cudf/_libxx/parquet.pyx because it changed. Compiling cudf/_libxx/quantiles.pyx because it changed. Compiling cudf/_libxx/reduce.pyx because it changed. Compiling cudf/_libxx/replace.pyx because it changed. Compiling cudf/_libxx/reshape.pyx because it changed. Compiling cudf/_libxx/rolling.pyx because it changed. Compiling cudf/_libxx/scalar.pyx because it changed. Compiling cudf/_libxx/search.pyx because it changed. Compiling cudf/_libxx/sort.pyx because it changed. Compiling cudf/_libxx/stream_compaction.pyx because it changed. Compiling cudf/_libxx/string_casting.pyx because it changed. Compiling cudf/_libxx/strings/char_types.pyx because it changed. Compiling cudf/_libxx/strings/replace.pyx because it changed. Compiling cudf/_libxx/strings/substring.pyx because it changed. Compiling cudf/_libxx/strings/wrap.pyx because it changed. Compiling cudf/_libxx/table.pyx because it changed. Compiling cudf/_libxx/transform.pyx because it changed. Compiling cudf/_libxx/transpose.pyx because it changed. Compiling cudf/_libxx/types.pyx because it changed. Compiling cudf/_libxx/unary.pyx because it changed. [ 1/63] Cythonizing cudf/_lib/arrow/_cuda.pyx

Error compiling Cython file:

... """ def cinit(self, CudaBuffer obj): self.buffer = obj self.reader = new CCudaBufferReader(self.buffer.buffer) self.set_random_access_file( shared_ptrRandomAccessFile) ^

cudf/_lib/arrow/_cuda.pyx:735:23: unknown type in template argument

Error compiling Cython file:

... buffering. """ def cinit(self, CudaBuffer buffer): self.buffer = buffer self.writer = new CCudaBufferWriter(self.buffer.cuda_buffer) self.set_output_stream(shared_ptrOutputStream) ^

cudf/_lib/arrow/_cuda.pyx:782:42: unknown type in template argument

Error compiling Cython file:

...

    with nogil:
        if whence == 0:
            offset = position
        elif whence == 1:
            check_status(self.writer.Tell(&offset))
                                        ^

cudf/_lib/arrow/_cuda.pyx:817:45: Call with wrong number of arguments (expected 0, got 1)
Traceback (most recent call last):
  File "setup.py", line 84, in <module>
    profile=False, language_level=3, embedsignature=True
  File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1101, in cythonize
    cythonize_one(*args)
  File "/usr/local/lib/python3.7/dist-packages/Cython/Build/Dependencies.py", line 1224, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: cudf/_lib/arrow/_cuda.pyx

aba312 commented 4 years ago

I have the same or similar issue with Python3.6.

cudf/_lib/arrow/_cuda.pyx:806:42: Constructing Python tuple not allowed without gil
Traceback (most recent call last):
  File "setup.py", line 84, in <module>
    profile=False, language_level=3, embedsignature=True
  File "/home/\/.local/lib/python3.6/site-packages/Cython/Build/Dependencies.py", line 1102, in cythonize
    cythonize_one(*args)
  File "/home/\/.local/lib/python3.6/site-packages/Cython/Build/Dependencies.py", line 1225, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: cudf/_lib/arrow/_cuda.pyx

However pyarrow imports without issue:

Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> exit()

This is failing right away when attempting to cythonize cudf/_lib/arrow/_cuda.pyx

Here are the other errors:

cudf/_lib/arrow/_cuda.pyx:806:36: Object of type 'CCudaBufferWriter' has no attribute 'Flush'

cudf/_lib/arrow/_cuda.pyx:806:42: Cannot convert Python object to 'const CStatus'

cudf/_lib/arrow/_cuda.pyx:806:42: Coercion from Python not allowed without the GIL

kkraus14 commented 4 years ago

@aba312 how did you install the arrow CPP library and PyArrow?

aba312 commented 4 years ago

Okay - I'm going to include the following for anyone who comes along frustrated as I was.

By the way pytest should be included as an environment dependency in your documentation.

TL;DR: I was able to get CUDF installed, but the tests are still failing, and pyarrow segfaults when I try to import it. So I still don't have a working environment (I've included the test output at the bottom).

I'd be happy to move this to a different issue or location if it makes sense for it to be elsewhere, maybe in something about build instructions, because starting from a fresh installation of CUDA 10.2, python3 and Ubuntu 18.04 has been challenging, to say the least.

Thank you,


It turns out my pyarrow was probably installed via pip and was version 0.12.1 (so a red herring)

Arrow was downloaded, built and installed as part of the CUDF build and install. I'm building from source but not using Conda.

The Arrow configuration is controlled by a cmake module in:

CUDF_HOME/cpp/cmake/Module/ConfigureArrow.cmake

In here I edited the ARROW_PYTHON and ARROW_BUILD_SHARED variables to be "ON". If these are not edited here, cmake simply overwrites the command-line variables with whatever is in the ConfigureArrow.cmake module description.

I also had to re-symlink my python distribution so that python=python3, a la this Stack Overflow solution. I symlinked back afterwards so as not to potentially break Ubuntu in the future.

This allowed me to get libarrow_python.so built, and after an ldconfig update python sees the library. However, when I attempt to import pyarrow, I get:

>>> import pyarrow
>>> Segmentation fault (core dumped)

Despite this, I was able to run without error:

CUDF_HOME/python/cudf$ python setup.py build_ext --inplace

and

CUDF_HOME/python/cudf$ python setup.py install --user

However when I then attempt to run pytest I get the following (edited to remove user names):

PyTest Errors:

> =================================== test session starts ==================================== platform linux -- Python 3.6.9, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /usr/bin/python3 cachedir: .pytest_cache rootdir: /home//cudf/python collected 0 items / 1 error > ========================================== ERRORS ========================================== ______________________________ ERROR collecting test session _______________________________ ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:495: in _importconftest return self._conftestpath2mod[key] E KeyError: PosixPath('/home/\/cudf/python/cudf/cudf/tests/conftest.py') > During handling of the above exception, another exception occurred: ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:501: in _importconftest mod = conftestpath.pyimport() ../../.local/lib/python3.6/site-packages/py/_path/local.py:701: in pyimport __import__(modname) cudf/cudf/__init__.py:5: in validate_setup() cudf/cudf/utils/gpu_utils.py:2: in validate_setup from cudf._cuda.gpu import ( E ImportError: /home/\/cudf/python/cudf/cudf/_cuda/gpu.cpython-36m-x86_64-linux-gnu.so: undefined symbol: cudaDriverGetVersion > During handling of the above exception, another exception occurred: ../../.local/lib/python3.6/site-packages/py/_path/common.py:383: in visit for x in Visitor(fil, rec, ignore, bf, sort).gen(self): ../../.local/lib/python3.6/site-packages/py/_path/common.py:435: in gen for p in self.gen(subdir): ../../.local/lib/python3.6/site-packages/py/_path/common.py:435: in gen for p in self.gen(subdir): ../../.local/lib/python3.6/site-packages/py/_path/common.py:424: in gen dirs = self.optsort([p for p in entries ../../.local/lib/python3.6/site-packages/py/_path/common.py:425: in if p.check(dir=1) and (rec is None or rec(p))]) ../../.local/lib/python3.6/site-packages/_pytest/nodes.py:506: in _recurse ihook = self._gethookproxy(dirpath) ../../.local/lib/python3.6/site-packages/_pytest/nodes.py:487: in 
_gethookproxy my_conftestmodules = pm._getconftestmodules(fspath) ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:473: in _getconftestmodules mod = self._importconftest(conftestpath) ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:509: in _importconftest raise ConftestImportFailure(conftestpath, sys.exc_info()) E _pytest.config.ConftestImportFailure: (local('/home/\/cudf/python/cudf/cudf/tests/conftest.py'), (, ImportError('/home/\/cudf/python/cudf/cudf/_cuda/gpu.cpython-36m-x86_64-linux-gnu.so: undefined symbol: cudaDriverGetVersion',), )) ================================= short test summary info ================================== ERROR - _pytest.config.ConftestImportFailure: (local('/home/\/cudf/python/cudf/cudf... !!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!! ===================================== 1 error in 0.15s =====================================

So obviously my configuration and libraries are still not correct.

kkraus14 commented 4 years ago

@aba312 I'm going to preface this by saying I would strongly suggest using conda if it's an option for you.

The ConfigureArrow.cmake is configured to build an arrow static library to statically link to within libcudf for a few small functions and the Arrow CUDA functions which currently aren't distributed anywhere. It wasn't designed or intended to build an installable arrow library. If installing arrow from pip, everything in the pip package is built assuming the old pre C++11 ABI. By default, libcudf and cudf build with the new C++11 ABI. This means that whichever one is loaded first loads the symbols from its ABI version and potentially blows up the other one.
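One rough way to see which ABI a shared library was built against is to look at its exported symbol names: with the new libstdc++ ABI, std::string and std::list live in the inline namespace std::__cxx11, so mangled names that involve them contain "__cxx11" (or the "B5cxx11" ABI tag). The sketch below assumes you have already collected symbol names, for example from nm -D; the function name is hypothetical. Note that the absence of these markers is not proof of the old ABI, since a library may simply not expose such types.

```python
# Markers that libstdc++ leaves in mangled names under the new (C++11) ABI.
CXX11_MARKERS = ("__cxx11", "B5cxx11")


def uses_cxx11_abi(symbols):
    """Return True if any mangled symbol name carries a new-ABI marker.

    `symbols` is any iterable of mangled names (hypothetical input, e.g.
    the last column of `nm -D libsomething.so`).
    """
    return any(marker in name for name in symbols for marker in CXX11_MARKERS)


if __name__ == "__main__":
    new_abi = ["_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev"]
    old_abi = ["_ZNSsC1Ev"]  # old-ABI std::string constructor
    print(uses_cxx11_abi(new_abi), uses_cxx11_abi(old_abi))
```

Comparing the result for the pyarrow libraries and for libcudf.so shows whether the two were built with matching ABIs.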

You can control the ABI of libcudf via the -DCMAKE_CXX11_ABI variable with instructions here: https://github.com/rapidsai/cudf/blob/branch-0.14/CONTRIBUTING.md#build-from-source

Your pytest run above is failing because it can't find a symbol from CUDA. Can you run ldd /home/<username>/cudf/python/cudf/cudf/_cuda/gpu.cpython-36m-x86_64-linux-gnu.so and dump the output here? I'm guessing it's failing to find either libcudart or libcuda.

> By the way pytest should be included as an environment dependency in your documentation.
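Alongside ldd, a rough first check is whether the system can locate the CUDA libraries at all. The sketch below uses ctypes.util.find_library from the standard library; the helper name locate is made up, and find_library does not follow exactly the same search rules as the dynamic loader, so treat a None result as a hint rather than a diagnosis.

```python
import ctypes.util


def locate(names=("cudart", "cuda")):
    """Map each library name to what find_library reports (None if not found)."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in locate().items():
        print(f"lib{name}: {found or 'not found'}")
```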

Thanks will submit an issue to get this fixed.

aba312 commented 4 years ago

@kkraus14 First off, thank you for your assistance.

I am hesitant to use conda because I have had bad luck with it in the past getting certain other packages to work and play nicely. However, I'm beginning to run out of ideas here.

My pyArrow is being built from source against the C++11 ABI and CUDA. It now imports without segmentation faults. As a note your contributing guide should discuss the pyarrow issue in a bit more detail. Is a full install of pyArrow a dependency or does CUDF only need what it downloads and builds?

Still no CUDF success: I backed out my changes to the cmake module for Arrow configuration, though I kept the shared library ON because the make command for CUDF failed looking for a libarrow shared library. Note that I'm attempting to build the master branch of CUDF. CUDF builds and compiles, and "make test" completes successfully (all 107 tests pass). However py.test -v still fails (though it's a different failure now).

Here is the output of pytest:

>========================================== test session starts ========================================== platform linux -- Python 3.6.9, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /usr/bin/python3 cachedir: .pytest_cache hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/\/cudf/python/.hypothesis/examples') rootdir: /home/\/cudf/python plugins: hypothesis-5.8.0 collected 0 items / 1 error >================================================ ERRORS ================================================= _____________________________________ ERROR collecting test session _____________________________________ ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:495: in _importconftest return self._conftestpath2mod[key] E KeyError: PosixPath('/home/\/cudf/python/cudf/cudf/tests/conftest.py') >During handling of the above exception, another exception occurred: ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:501: in _importconftest mod = conftestpath.pyimport() ../../.local/lib/python3.6/site-packages/py/_path/local.py:701: in pyimport __import__(modname) cudf/cudf/__init__.py:7: in from cudf import core, datasets cudf/cudf/core/__init__.py:3: in from cudf.core import buffer, column cudf/cudf/core/column/__init__.py:1: in from cudf.core.column.categorical import CategoricalColumn # noqa: F401 cudf/cudf/core/column/categorical.py:11: in import cudf._libxx as libcudfxx cudf/cudf/_libxx/__init__.py:5: in from . import ( cudf/_libxx/arrow/_cuda.pxd:29: in init cudf._libxx.gpuarrow ??? 
E ImportError: /home/\/cudf/python/cudf/cudf/_libxx/arrow/_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN5arrow4cuda10CudaBuffer10FromBufferESt10shared_ptrINS_6BufferEEPS2_IS1_E >During handling of the above exception, another exception occurred: ../../.local/lib/python3.6/site-packages/py/_path/common.py:383: in visit for x in Visitor(fil, rec, ignore, bf, sort).gen(self): ../../.local/lib/python3.6/site-packages/py/_path/common.py:435: in gen for p in self.gen(subdir): ../../.local/lib/python3.6/site-packages/py/_path/common.py:435: in gen for p in self.gen(subdir): ../../.local/lib/python3.6/site-packages/py/_path/common.py:424: in gen dirs = self.optsort([p for p in entries ../../.local/lib/python3.6/site-packages/py/_path/common.py:425: in if p.check(dir=1) and (rec is None or rec(p))]) ../../.local/lib/python3.6/site-packages/_pytest/nodes.py:506: in _recurse ihook = self._gethookproxy(dirpath) ../../.local/lib/python3.6/site-packages/_pytest/nodes.py:487: in _gethookproxy my_conftestmodules = pm._getconftestmodules(fspath) ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:473: in _getconftestmodules mod = self._importconftest(conftestpath) ../../.local/lib/python3.6/site-packages/_pytest/config/__init__.py:509: in _importconftest raise ConftestImportFailure(conftestpath, sys.exc_info()) E _pytest.config.ConftestImportFailure: (local('/home/\/cudf/python/cudf/cudf/tests/conftest.py'), (, ImportError('/home/\/cudf/python/cudf/cudf/_libxx/arrow/_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN5arrow4cuda10CudaBuffer10FromBufferESt10shared_ptrINS_6BufferEEPS2_IS1_E',), )) ======================================== short test summary info ======================================== ERROR - _pytest.config.ConftestImportFailure: (local('/home/\/cudf/python/cudf/cudf/tests/confte... !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
=========================================== 1 error in 0.72s ============================================

This same error occurs when I attempt to import CUDF in python.
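The undefined symbol in the ImportError is an Itanium-mangled C++ name, and its leading part can be read off by hand: after "_ZN" come length-prefixed identifiers. The sketch below decodes only that nested-name prefix (not argument types or templates); it is an illustration with a hypothetical function name, and a tool such as c++filt gives the full demangled signature.

```python
def nested_name(mangled):
    """Decode the leading nested name of an Itanium-mangled symbol.

    Only handles the simple `_ZN<len><id><len><id>...` prefix; parsing
    stops at the first character that is not a length digit.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested Itanium-mangled name")
    i, parts = 3, []
    while i < len(mangled) and mangled[i].isdigit():
        j = i
        while j < len(mangled) and mangled[j].isdigit():
            j += 1
        length = int(mangled[i:j])
        parts.append(mangled[j:j + length])
        i = j + length
    return "::".join(parts)


if __name__ == "__main__":
    symbol = "_ZN5arrow4cuda10CudaBuffer10FromBufferESt10shared_ptrINS_6BufferEEPS2_IS1_E"
    print(nested_name(symbol))
```

For the symbol above this yields arrow::cuda::CudaBuffer::FromBuffer, i.e. a function from the Arrow CUDA library that the extension module could not resolve at load time.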

The ldd on the failing library _cuda.cpython-36m-x86_64-linux-gnu.so:

>$ ldd /home/\/cudf/python/cudf/cudf/_libxx/arrow/_cuda.cpython-36m-x86_64-linux-gnu.so
    linux-vdso.so.1 (0x00007ffd18982000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f55217d1000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f55215b9000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f552139a000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5520fa9000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5520c0b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f5521d94000)

Additionally after build I no longer have /home//cudf/python/cudf/cudf/_cuda/ - instead the contents of ~/cudf/python/cudf/cudf are:

comm core datasets.py __init__.py io _lib _libxx __pycache__ tests utils _version.py

Also, regarding the configuration of arrow for cudf: its C++ compiler flag was set to -std=new instead of -std=c++11, so it was compiling with C++14 even with -DCMAKE_CXX11_ABI=ON. I'm not sure whether this is actually an issue, but for consistency, CUDF's arrow compiles with C++11 if the flag "-DARROW_CXXFLAGS=-std=c++11" is added to ConfigureArrow.cmake.

kkraus14 commented 4 years ago

> My pyArrow is being built from source against the C++11 ABI and CUDA. It now imports without segmentation faults. As a note your contributing guide should discuss the pyarrow issue in a bit more detail. Is a full install of pyArrow a dependency or does CUDF only need what it downloads and builds?

CUDF only builds and installs what it needs. This includes: installing the arrow cpp GPU related headers that aren't included in a typical install of the arrow-cpp conda package, building a static library of the minimal amount of arrow cpp host functions which we statically link into libcudf, and building a static library of the arrow cpp gpu functions which we statically link the whole archive into libcudf (for use in vendored pyarrow gpu cython).

> However py.test -v still fails (though it's a different failure now).

It looks like it's failing to find / link to libcudf.so which should have those symbols included from static linking the entire archive. Could you dump the output of ldd /home/<username>/cudf/python/cudf/cudf/_libxx/arrow/_cuda.cpython-36m-x86_64-linux-gnu.so?

> The ldd on the failing library _cuda.cpython-36m-x86_64-linux-gnu.so:

This looks wrong to me. Here's the output of mine:

    linux-vdso.so.1 (0x00007ffff1f77000)
    libcudf.so => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libcudf.so (0x00007f70eadde000)
    libstdc++.so.6 => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libstdc++.so.6 (0x00007f71176a8000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f70eaa40000)
    libgcc_s.so.1 => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libgcc_s.so.1 (0x00007f7117682000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f70ea821000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f70ea430000)
    libNVCategory.so => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libNVCategory.so (0x00007f70e6186000)
    libNVStrings.so => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libNVStrings.so (0x00007f70e0d3a000)
    librmm.so => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/librmm.so (0x00007f70e0b23000)
    libnvrtc.so.10.1 => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libnvrtc.so.10.1 (0x00007f70df3b2000)
    libcudart.so.10.1 => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libcudart.so.10.1 (0x00007f70df133000)
    libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f70ddf4b000)
    libz.so.1 => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libz.so.1 (0x00007f7117664000)
    libboost_filesystem.so.1.70.0 => /home/nfs/kkraus/miniconda3/envs/cudf_dev/lib/libboost_filesystem.so.1.70.0 (0x00007f70ddf2b000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f70ddd27000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f7117634000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f70ddb1f000)

It looks like it's failing to link to libcudf.so, which in turn brings in all of the transitive dependencies.

Where is libcudf.so being installed for you?
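The comparison between the two ldd listings can also be done mechanically. The sketch below parses ldd text output into a name-to-path mapping and reports which expected libraries are absent; the function names and the default expected list are illustrative choices, not part of cudf.

```python
def linked_libraries(ldd_output):
    """Parse `ldd` text output into {library name: resolved path}."""
    libs = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # e.g. linux-vdso or the dynamic loader line
        name, _, rest = line.partition("=>")
        # Drop the trailing load address such as "(0x00007f...)".
        libs[name.strip()] = rest.strip().split(" (")[0].strip()
    return libs


def missing_expected(ldd_output, expected=("libcudf.so",)):
    """Return the expected libraries that do not appear in the ldd output."""
    libs = linked_libraries(ldd_output)
    return [lib for lib in expected if lib not in libs]


if __name__ == "__main__":
    sample = """\
    linux-vdso.so.1 (0x00007ffd18982000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f55217d1000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5520fa9000)
    """
    print(missing_expected(sample))
```

A library that ldd lists as "not found" still appears in the mapping with that text as its path, so it is worth printing the full mapping as well.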

aba312 commented 4 years ago

My libcudf.so was installed to /usr/local/lib. I was able to rebuild cudf (I think I had some old environment issues last week) and get pytest to run, but it did require rebuilding pyArrow based on "Building pyarrow with CUDA support" (randyzwitch.com) and setting the following flags before building pyArrow:

    export PYARROW_WITH_PARQUET=1
    export PYARROW_WITH_CUDA=1
    export PYARROW_WITH_ORC=1

However, even though the tests ran, they didn't pass, and when following the CUDF getting-started guide the CUDF dataframe kept throwing DeviceNDArray has no attribute "nbytes" in column.py. This must be a simple error, but I wasn't able to find anything on Google about what might be causing it. Maybe Numba?
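If the pyArrow build is driven from a script rather than an interactive shell, the three PYARROW_WITH_* flags can be passed through the subprocess environment instead of being exported globally. This is a minimal sketch; the helper name is hypothetical, and the build command in the comment is an assumption about how the pyarrow source tree of that era was built, not something confirmed in this thread.

```python
import os

# The three flags mentioned above, as they would be exported in a shell.
PYARROW_FLAGS = {
    "PYARROW_WITH_PARQUET": "1",
    "PYARROW_WITH_CUDA": "1",
    "PYARROW_WITH_ORC": "1",
}


def pyarrow_build_env(base=None):
    """Return a copy of the environment with the pyarrow build flags set."""
    env = dict(os.environ if base is None else base)
    env.update(PYARROW_FLAGS)
    return env


if __name__ == "__main__":
    env = pyarrow_build_env()
    # Assumed usage (path and command are placeholders):
    # subprocess.run([sys.executable, "setup.py", "build_ext", "--inplace"],
    #                cwd="<arrow checkout>/python", env=env, check=True)
    print({k: env[k] for k in PYARROW_FLAGS})
```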

Here's the pytest summary:

14752 failed, 5952 passed, 947 skipped, 599 xfailed, 4 xpassed, 2580 warnings, 6 errors in 2643.77s (0:44:03)

I piped the pyTest output file to a text file which is 45 MB, if you are interested here is a firefox send link: https://send.firefox.com/download/dc1be87f210728b1/#tS0zJQ7ITCW4Os6N8uN67g

which will expire in 7 days or after 5 downloads.

Previously when I had tried Miniconda the environment failed to solve. This time however, it looks like everything worked and I seem to have a working (or at least more functional than building it myself) install.

Thank you again for all of your assistance.


kkraus14 commented 4 years ago

Yes that's from Numba, what version of Numba do you have installed? We require >=0.48.0.
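A small guard like the following can catch this kind of mismatch before running the test suite. It is a sketch with hypothetical helper names; it compares only the first three numeric components and ignores pre-release suffixes, so a proper version parser is preferable where one is available.

```python
import re


def version_tuple(version):
    """Turn '0.48' or '0.48.0' into a 3-tuple of ints, padding with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def meets_minimum(installed, minimum="0.48.0"):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # In practice the installed version would come from numba.__version__.
    for candidate in ("0.41.0", "0.48.0", "0.49.1"):
        print(candidate, meets_minimum(candidate))
```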

aba312 commented 4 years ago

I had numba 0.41 (which is what pip installed without forcing pip install numba==0.48). I was also missing partd.

However after this, I now have all but 3 tests passing! I'm super excited!

======================== short test summary info ========================
FAILED dask_cudf/dask_cudf/io/tests/test_csv.py::test_csv_roundtrip - TypeError: argument of type 'PosixPath' is not iterable
FAILED dask_cudf/dask_cudf/io/tests/test_csv.py::test_read_csv - TypeError: argument of type 'PosixPath' is not iterable
FAILED dask_cudf/dask_cudf/io/tests/test_json.py::test_read_json - TypeError: argument of type 'PosixPath' is not iterable
==== 3 failed, 20714 passed, 937 skipped, 548 xfailed, 58 xpassed, 2927 warnings in 768.65s (0:12:48) ====

I don't need multi-gpu support right now, so if part of dask is failing, I'm not too worried. I get to work on CUML now :).

I would recommend adding a document, or a table in "CONTRIBUTING.md", that lists the required versions of all dependencies for those who want to build from scratch.

It should also mention that arrow/pyArrow needs to be built against CUDA separately from what CUDF includes.

I honestly could not have gotten this building without your assistance, which unfortunately implies that the build documentation needs some work. I also understand the challenge of testing builds on multiple configurations and the challenge of "starting from scratch".

Thank you again.


kkraus14 commented 4 years ago

Happy it's mostly working! What branch of cudf are you using? One of Dask's dependencies updated recently that caused that issue which I believe we fixed in the latest code of 0.14.
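For readers who hit the same TypeError: it is what Python raises when code written for string paths applies the `in` operator to a pathlib.Path. The sketch below reproduces that general pattern and one common remedy (os.fspath); it is an illustration only, the function name is made up, and it is not claimed to be the actual code path in Dask or dask_cudf.

```python
import os
from pathlib import Path


def looks_like_url(path):
    """Return True if the path contains a URL scheme separator.

    os.fspath converts Path objects to str first, so both str and
    Path inputs are accepted.
    """
    return "://" in os.fspath(path)


if __name__ == "__main__":
    p = Path("data.csv")
    try:
        "://" in p  # a Path is not a container, so this raises TypeError
    except TypeError as exc:
        print("without fspath:", exc)
    print("with fspath:", looks_like_url(p))
```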

Agreed, we could use some updated documentation here to improve the experience of building from source. One of the challenges is that if the user is using pip-installed packages like pyarrow, they're built with the pre-C++11 ABI, whereas libcudf by default builds with the new ABI, and all of a sudden things explode. Additionally, we can't really find the cpp libs buried in the Python installation directories, and our CMake shouldn't install a bunch of things into your system prefix.

Would you mind opening a github issue about improving the building from source documentation?

kkraus14 commented 4 years ago

Closing as resolved.