rapidsai / dask-cuda

Utilities for Dask and CUDA interactions
https://docs.rapids.ai/api/dask-cuda/stable/
Apache License 2.0
292 stars 93 forks

Solver functions give "no kernel image is available for execution on the device" #318

Closed davidnoz123 closed 4 years ago

davidnoz123 commented 4 years ago

I've cobbled together a couple of dask_cuda examples I found on the web (see below). I believe the script should work, but I get a "no kernel image is available for execution on the device" exception on the line "XT = cumlModel.fit_transform(X_cudf)". Can anyone get this working on their setup? If so, could you reply with your "conda list" output? Thanks in advance!

import sys
import time
import numpy as np
from cuml.dask.decomposition import PCA
from cuml.dask.datasets import make_blobs

from dask.distributed import Client, wait
from dask_cuda import LocalCUDACluster

from pandas.testing import assert_frame_equal
import cudf as gd
import dask_cudf as dgd

if __name__ == '__main__':
    cluster = LocalCUDACluster()
    client = Client(cluster)
    time.sleep(5)
    print("cluster status ",cluster.status)
    print("cluster infomarion ", cluster)
    print("client information ",client)

    # The number of workers equals the number of GPUs.

    nelem = 10000000

    df = gd.DataFrame()
    df["x"] = np.arange(nelem)
    df["y"] = np.random.randint(nelem, size=nelem)

    ddf = dgd.from_cudf(df, npartitions=5)

    delays = ddf.to_delayed()

    assert len(delays) == 5

    # Concat the delayed partitions
    got = gd.concat([d.compute() for d in delays])
    assert_frame_equal(got.to_pandas(), df.to_pandas())

    nrows = 6
    ncols = 3
    n_parts = 2

    X_cudf, _ = make_blobs(nrows, ncols, centers=1, n_parts=n_parts, random_state=10, dtype=np.float64)

    wait(X_cudf)

    print("Input Matrix")
    print(X_cudf.compute())

    cumlModel = PCA(n_components = 1, whiten=False)
    XT = cumlModel.fit_transform(X_cudf)

    print("Transformed Input Matrix")
    print(XT.compute())
pentschev commented 4 years ago

This works for me with a fresh nightly build install but fails with stable (0.14). It seems to come from cuml prims:

```text
cluster status running
cluster infomarion LocalCUDACluster('tcp://127.0.0.1:45959', workers=8, threads=8, memory=1.08 TB)
client information
Input Matrix
[[-1.74721092 2.14871726 6.56769612]
 [-1.02317438 2.55977184 5.86181299]
 [-4.19011622 4.88020029 6.81418569]
 [-2.8100962 2.54433874 8.15304742]
 [-3.78735068 4.28623727 7.86674681]
 [-3.32177127 2.30468961 7.09697537]]
distributed.worker - WARNING - Compute Failed
Function: _func_fit
args: (PCAMG(), [array([[-2.8100962 , 2.54433874, 8.15304742], [-3.78735068, 4.28623727, 7.86674681], [-3.32177127, 2.30468961, 7.09697537]])], 6, 3, [3], 1, False)
kwargs: {}
Exception: RuntimeError("Exception occured! file=/conda/conda-bld/libcumlprims_1591198485167/work/cpp/build/cuml/src/cuml/cpp/src_prims/stats/sum.cuh line=94: FAIL: call='cudaPeekAtLastError()'. Reason:no kernel image is available for execution on the device\nObtained 37 stack frames\n#0 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0fd8e0913e]\n#1 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9ExceptionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7f0fd8e09c50]\n#2 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg9mean_implIdLi256EEEvRNS_6Matrix4DataIT_EERKSt6vectorIPS6_SaIS9_EERKNS3_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0xf70) [0x7f0fa317a790]\n#3 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg4meanERNS_6Matrix4DataIdEERKSt6vectorIPS4_SaIS7_EERKNS2_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0x4e) [0x7f0fa317960e]\n#4 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleERSt6vectorIPN8MLCommon6Matrix4DataIT_EESaISB_EERNS7_14PartDescriptorEPS9_SH_SH_SH_SH_SH_NS_9paramsPCAEPP11CUstream_stib+0x12a) [0x7f0fa312448a]\n#5 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS6_4DataIT_EEPSB_SF_SF_SF_SF_SF_NS_9paramsPCAEb+0x284) [0x7f0fa3124a14]\n#6 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg3fitERNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS5_4DataIdEEPdSD_SD_SD_SD_SD_NS_9paramsPCAEb+0x75) [0x7f0fa310b065]\n#7 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0x969e) [0x7f0f8017269e]\n#8 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0xad9b) [0x7f0f80173d9b]\n#9 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0x6210) [0x7f0f8015b210]\n#10 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xbbcb) [0x7f0f80160bcb]\n#11 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xe90d) [0x7f0f8016390d]\n#12 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallKeywords+0x15c) [0x562d053d6bec]\n#13 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181661) [0x562d053d7661]\n#14 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x48a2) [0x562d0541d762]\n#15 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x562d0538e138]\n#16 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x562d0541ab8d]\n#17 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x562d0538e138]\n#18 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x562d0541ab8d]\n#19 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x562d0538ed37]\n#20 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x562d053d7455]\n#21 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x562d054194d1]\n#22 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x562d0538e138]\n#23 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x562d0541ab8d]\n#24 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x562d0538ed37]\n#25 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x562d053d7455]\n#26 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x562d054194d1]\n#27 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x562d0538ed37]\n#28 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x562d053d7455]\n#29 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x562d054194d1]\n#30 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallDict+0x1b6) [0x562d05370ab6]\n#31 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x12f071) [0x562d05385071]\n#32 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(PyObject_Call+0xb4) [0x562d05371214]\n#33 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x2218e3) [0x562d054778e3]\n#34 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x1dac27) [0x562d05430c27]\n#35 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f10d5a326db]\n#36 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f10d575b88f]\n") distributed.worker - WARNING - Compute Failed Function: _func_fit args: (PCAMG(), [array([[-1.74721092, 2.14871726, 6.56769612], [-1.02317438, 2.55977184, 5.86181299], [-4.19011622, 4.88020029, 6.81418569]])], 6, 3, [3], 0, False) kwargs: {} Exception: RuntimeError("Exception occured! file=/conda/conda-bld/libcumlprims_1591198485167/work/cpp/build/cuml/src/cuml/cpp/src_prims/stats/sum.cuh line=94: FAIL: call='cudaPeekAtLastError()'. 
Reason:no kernel image is available for execution on the device\nObtained 37 stack frames\n#0 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7fbaa193913e]\n#1 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9ExceptionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7fbaa1939c50]\n#2 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg9mean_implIdLi256EEEvRNS_6Matrix4DataIT_EERKSt6vectorIPS6_SaIS9_EERKNS3_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0xf70) [0x7fba92a98790]\n#3 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg4meanERNS_6Matrix4DataIdEERKSt6vectorIPS4_SaIS7_EERKNS2_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0x4e) [0x7fba92a9760e]\n#4 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleERSt6vectorIPN8MLCommon6Matrix4DataIT_EESaISB_EERNS7_14PartDescriptorEPS9_SH_SH_SH_SH_SH_NS_9paramsPCAEPP11CUstream_stib+0x12a) [0x7fba92a4248a]\n#5 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS6_4DataIT_EEPSB_SF_SF_SF_SF_SF_NS_9paramsPCAEb+0x284) [0x7fba92a42a14]\n#6 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg3fitERNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS5_4DataIdEEPdSD_SD_SD_SD_SD_NS_9paramsPCAEb+0x75) [0x7fba92a29065]\n#7 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0x969e) [0x7fba8407969e]\n#8 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0xad9b) [0x7fba8407ad9b]\n#9 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0x6210) [0x7fba84062210]\n#10 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xbbcb) [0x7fba84067bcb]\n#11 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xe90d) [0x7fba8406a90d]\n#12 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallKeywords+0x15c) [0x564ad9cc2bec]\n#13 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181661) [0x564ad9cc3661]\n#14 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x48a2) [0x564ad9d09762]\n#15 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x564ad9c7a138]\n#16 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x564ad9d06b8d]\n#17 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x564ad9c7a138]\n#18 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x564ad9d06b8d]\n#19 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) 
[0x564ad9c7ad37]\n#20 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x564ad9cc3455]\n#21 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x564ad9d054d1]\n#22 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x564ad9c7a138]\n#23 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x564ad9d06b8d]\n#24 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x564ad9c7ad37]\n#25 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x564ad9cc3455]\n#26 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x564ad9d054d1]\n#27 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x564ad9c7ad37]\n#28 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x564ad9cc3455]\n#29 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x564ad9d054d1]\n#30 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallDict+0x1b6) [0x564ad9c5cab6]\n#31 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x12f071) [0x564ad9c71071]\n#32 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(PyObject_Call+0xb4) [0x564ad9c5d214]\n#33 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x2218e3) [0x564ad9d638e3]\n#34 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x1dac27) [0x564ad9d1cc27]\n#35 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fbbc534f6db]\n#36 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fbbc507888f]\n") Traceback (most recent call last): File "dask-cuda-318.py", line 52, in XT = cumlModel.fit_transform(X_cudf) File "/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/dask/decomposition/pca.py", line 
189, in fit_transform return self.fit(X).transform(X) File "/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/dask/decomposition/pca.py", line 173, in fit self._fit(X) File "/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/dask/decomposition/base.py", line 101, in _fit raise_exception_from_futures(list(pca_fit.values())) File "/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/dask/common/utils.py", line 144, in raise_exception_from_futures len(errs), len(futures), ", ".join(map(str, errs)) RuntimeError: 2 of 2 worker jobs failed: Exception occured! file=/conda/conda-bld/libcumlprims_1591198485167/work/cpp/build/cuml/src/cuml/cpp/src_prims/stats/sum.cuh line=94: FAIL: call='cudaPeekAtLastError()'. Reason:no kernel image is available for execution on the device Obtained 37 stack frames #0 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7fbaa193913e] #1 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9ExceptionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7fbaa1939c50] #2 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg9mean_implIdLi256EEEvRNS_6Matrix4DataIT_EERKSt6vectorIPS6_SaIS9_EERKNS3_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0xf70) [0x7fba92a98790] #3 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg4meanERNS_6Matrix4DataIdEERKSt6vectorIPS4_SaIS7_EERKNS2_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0x4e) [0x7fba92a9760e] #4 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleERSt6vectorIPN8MLCommon6Matrix4DataIT_EESaISB_EERNS7_14PartDescriptorEPS9_SH_SH_SH_SH_SH_NS_9paramsPCAEPP11CUstream_stib+0x12a) [0x7fba92a4248a] #5 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS6_4DataIT_EEPSB_SF_SF_SF_SF_SF_NS_9paramsPCAEb+0x284) [0x7fba92a42a14] #6 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg3fitERNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS5_4DataIdEEPdSD_SD_SD_SD_SD_NS_9paramsPCAEb+0x75) [0x7fba92a29065] #7 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0x969e) [0x7fba8407969e] #8 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0xad9b) [0x7fba8407ad9b] #9 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0x6210) [0x7fba84062210] #10 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xbbcb) [0x7fba84067bcb] #11 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xe90d) 
[0x7fba8406a90d] #12 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallKeywords+0x15c) [0x564ad9cc2bec] #13 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181661) [0x564ad9cc3661] #14 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x48a2) [0x564ad9d09762] #15 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x564ad9c7a138] #16 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x564ad9d06b8d] #17 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x564ad9c7a138] #18 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x564ad9d06b8d] #19 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x564ad9c7ad37] #20 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x564ad9cc3455] #21 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x564ad9d054d1] #22 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x564ad9c7a138] #23 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x564ad9d06b8d] #24 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x564ad9c7ad37] #25 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x564ad9cc3455] #26 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x564ad9d054d1] #27 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x564ad9c7ad37] #28 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x564ad9cc3455] #29 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x564ad9d054d1] #30 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallDict+0x1b6) [0x564ad9c5cab6] #31 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x12f071) [0x564ad9c71071] #32 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(PyObject_Call+0xb4) [0x564ad9c5d214] #33 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x2218e3) [0x564ad9d638e3] #34 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x1dac27) [0x564ad9d1cc27] #35 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fbbc534f6db] #36 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fbbc507888f] , Exception occured! file=/conda/conda-bld/libcumlprims_1591198485167/work/cpp/build/cuml/src/cuml/cpp/src_prims/stats/sum.cuh line=94: FAIL: call='cudaPeekAtLastError()'. Reason:no kernel image is available for execution on the device Obtained 37 stack frames #0 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9Exception16collectCallStackEv+0x3e) [0x7f0fd8e0913e] #1 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/pointer_utils.cpython-37m-x86_64-linux-gnu.so(_ZN8MLCommon9ExceptionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x80) [0x7f0fd8e09c50] #2 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg9mean_implIdLi256EEEvRNS_6Matrix4DataIT_EERKSt6vectorIPS6_SaIS9_EERKNS3_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0xf70) [0x7f0fa317a790] #3 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN8MLCommon5Stats3opg4meanERNS_6Matrix4DataIdEERKSt6vectorIPS4_SaIS7_EERKNS2_14PartDescriptorERKNS_16cumlCommunicatorESt10shared_ptrINS_15deviceAllocatorEEPP11CUstream_stiP13cublasContext+0x4e) [0x7f0fa317960e] #4 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleERSt6vectorIPN8MLCommon6Matrix4DataIT_EESaISB_EERNS7_14PartDescriptorEPS9_SH_SH_SH_SH_SH_NS_9paramsPCAEPP11CUstream_stib+0x12a) [0x7f0fa312448a] #5 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg8fit_implIdEEvRNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS6_4DataIT_EEPSB_SF_SF_SF_SF_SF_NS_9paramsPCAEb+0x284) [0x7f0fa3124a14] #6 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/common/../../../../libcumlprims.so(_ZN2ML3PCA3opg3fitERNS_10cumlHandleEPPN8MLCommon6Matrix12RankSizePairEmPPNS5_4DataIdEEPdSD_SD_SD_SD_SD_NS_9paramsPCAEb+0x75) [0x7f0fa310b065] #7 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0x969e) [0x7f0f8017269e] #8 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/pca_mg.cpython-37m-x86_64-linux-gnu.so(+0xad9b) [0x7f0f80173d9b] #9 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0x6210) [0x7f0f8015b210] #10 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xbbcb) [0x7f0f80160bcb] #11 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/lib/python3.7/site-packages/cuml/decomposition/base_mg.cpython-37m-x86_64-linux-gnu.so(+0xe90d) 
[0x7f0f8016390d] #12 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallKeywords+0x15c) [0x562d053d6bec] #13 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181661) [0x562d053d7661] #14 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x48a2) [0x562d0541d762] #15 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x562d0538e138] #16 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x562d0541ab8d] #17 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x562d0538e138] #18 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x562d0541ab8d] #19 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x562d0538ed37] #20 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x562d053d7455] #21 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x562d054194d1] #22 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallDict+0x118) [0x562d0538e138] #23 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x1ccd) [0x562d0541ab8d] #24 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x562d0538ed37] #25 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x562d053d7455] #26 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x562d054194d1] #27 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyFunction_FastCallKeywords+0x187) [0x562d0538ed37] #28 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x181455) [0x562d053d7455] #29 in 
/datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyEval_EvalFrameDefault+0x611) [0x562d054194d1] #30 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(_PyObject_FastCallDict+0x1b6) [0x562d05370ab6] #31 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x12f071) [0x562d05385071] #32 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(PyObject_Call+0xb4) [0x562d05371214] #33 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x2218e3) [0x562d054778e3] #34 in /datasets/pentschev/miniconda3/envs/rn-102-0.14/bin/python(+0x1dac27) [0x562d05430c27] #35 in /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f10d5a326db] #36 in /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f10d575b88f] ```

Any ideas what could be the issue here @dantegd @cjnolet ?
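
For anyone reading a dump like the one above, the useful part is the first line of each exception: the source file, the line number, the failing CUDA call, and the reason. The sketch below pulls those fields out of the message with a regular expression. It assumes the message keeps the `file=... line=...: FAIL: call='...'. Reason:...` layout seen in this log; `summarize_cuda_error` is a hypothetical helper written for illustration and is not part of cuml.

```python
import re

# Layout assumed from the log in this thread:
#   file=<path> line=<n>: FAIL: call='<call>'. Reason:<text>
ERROR_PATTERN = re.compile(
    r"file=(?P<file>\S+) line=(?P<line>\d+): "
    r"FAIL: call='(?P<call>[^']*)'\. Reason:(?P<reason>[^\n]*)"
)


def summarize_cuda_error(message):
    """Return the file, line, call and reason from an error message, or None."""
    match = ERROR_PATTERN.search(message)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    return info


# First lines of the exception text quoted in this thread.
sample = (
    "Exception occured! file=/conda/conda-bld/libcumlprims_1591198485167/work/"
    "cpp/build/cuml/src/cuml/cpp/src_prims/stats/sum.cuh line=94: "
    "FAIL: call='cudaPeekAtLastError()'. "
    "Reason:no kernel image is available for execution on the device\n"
    "Obtained 37 stack frames\n"
)

print(summarize_cuda_error(sample))
```

The `file` field contains `libcumlprims` in its build path, which is what points at cuml prims rather than at dask-cuda itself.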

kkraus14 commented 4 years ago

"no kernel image is available for execution on the device" typically means that you're using an unsupported GPU architecture.

Could you dump the output of nvidia-smi here?
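
To make the "unsupported architecture" point concrete, here is a minimal sketch of the rule the CUDA runtime applies when it looks for a kernel image. Compiled device code (SASS) built for compute capability X.y runs only on devices with the same major version X and a minor version of at least y, while embedded PTX can be JIT-compiled for any device with an equal or higher compute capability. The function names and the architecture lists are hypothetical and chosen for illustration; they are not the actual build targets of any RAPIDS package.

```python
def parse_cc(text):
    """Turn a compute capability such as "6.0" into a (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def kernel_image_available(device_cc, sass_archs, ptx_archs=()):
    """Return True if a binary with these targets can run on the device.

    sass_archs: compute capabilities the binary ships compiled code for.
    ptx_archs:  compute capabilities the binary ships PTX for.
    """
    device = parse_cc(device_cc)
    # Compiled code: same major version, minor version not above the device's.
    for arch in sass_archs:
        major, minor = parse_cc(arch)
        if major == device[0] and minor <= device[1]:
            return True
    # PTX: can be JIT-compiled for the same or any newer architecture.
    for arch in ptx_archs:
        if parse_cc(arch) <= device:
            return True
    return False


# Hypothetical build targets, for illustration only.
volta_turing_only = ["7.0", "7.5"]

# A device with compute capability 6.0 (for example a Tesla P100).
print(kernel_image_available("6.0", volta_turing_only))
print(kernel_image_available("6.0", ["6.0", "7.0", "7.5"]))
```

Under these assumptions, a library built only for 7.x has nothing a 6.0 device can load, which is one way to end up with this error even on a GPU that the rest of the stack supports.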

davidnoz123 commented 4 years ago

Thanks for your replies. I tried the nightly build but got the same problem as before. Note that I'm using Tesla P100s, which I believe are supported. Below are my conda list and nvidia-smi outputs.

# packages in environment at /home/david/anaconda3:
#
# Name                    Version                   Build  Channel
_ipyw_jlab_nb_ext_conf    0.1.0                    py37_0
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       0_gnu    conda-forge
aiohttp                   3.6.2            py37h516909a_0    conda-forge
anaconda-client           1.7.2                    py37_0
anaconda-navigator        1.9.12                   py37_0
appdirs                   1.4.3                      py_1    conda-forge
arrow-cpp                 0.15.0           py37h5ac5442_0    conda-forge
async-timeout             3.0.1                   py_1000    conda-forge
attrs                     19.3.0                     py_0
backcall                  0.1.0                    py37_0
backports                 1.0                        py_2
backports.functools_lru_cache 1.6.1                      py_0
backports.tempfile        1.0                        py_1
backports.weakref         1.0.post1                  py_1
beautifulsoup4            4.9.1                    py37_0
blas                      1.0                         mkl    anaconda
bleach                    3.1.4                      py_0
bokeh                     1.4.0            py37hc8dfbb8_1    conda-forge
boost                     1.70.0           py37h9de70de_1    conda-forge
boost-cpp                 1.70.0               h8e57a91_2    conda-forge
brotli                    1.0.7             he1b5a44_1002    conda-forge
bzip2                     1.0.8                h7b6447c_0
c-ares                    1.15.0            h516909a_1001    conda-forge
ca-certificates           2020.1.1                      0    anaconda
cairo                     1.16.0            hcf35c78_1003    conda-forge
certifi                   2020.4.5.1               py37_0    anaconda
cffi                      1.14.0           py37h2e261b9_0
cfitsio                   3.470                hb60a0a2_2    conda-forge
chardet                   3.0.4                 py37_1003
click                     7.1.2                      py_0
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.5.0                      py_0    conda-forge
cloudpickle               1.4.1                      py_0    conda-forge
clyent                    1.2.2                    py37_1
colorcet                  2.0.1                      py_0    conda-forge
conda                     4.8.3                    py37_0    anaconda
conda-build               3.18.11                  py37_0
conda-env                 2.6.0                         1
conda-package-handling    1.6.1            py37h7b6447c_0
conda-verify              3.4.2                      py_1
cryptography              2.9.2            py37h1ba5d50_0
cudatoolkit               10.1.243             h6bb024c_0    nvidia
cudf                      0.15.0a200612   py37_g046385d16_1192    rapidsai-nightly
cudnn                     7.6.0                cuda10.1_0    nvidia
cugraph                   0.15.0a200616   py37_gaeee8aae_274    rapidsai-nightly
cuml                      0.15.0a200616   cuda10.1_py37_g24812e323_458    rapidsai-nightly
cupy                      7.5.0            py37h0632833_0    conda-forge
curl                      7.68.0               hf8cf82a_0    conda-forge
cusignal                  0.15.0a200616           py37_62    rapidsai-nightly
cuspatial                 0.15.0a200615   py37_g091c1c1_128    rapidsai-nightly
cuxfilter                 0.15.0a200616           py37_86    rapidsai-nightly
cycler                    0.10.0                     py_2    conda-forge
cytoolz                   0.10.1           py37h516909a_0    conda-forge
dask                      2.18.1                     py_0    conda-forge
dask-core                 2.18.1                     py_0    conda-forge
dask-cuda                 0.15.0a200616           py37_16    rapidsai-nightly
dask-cudf                 0.15.0a200612   py37_ge5dd80b3f_1252    rapidsai-nightly
dask-xgboost              0.2.0.dev28      cuda10.1py37_0    rapidsai-nightly
datashader                0.10.0                     py_0    conda-forge
datashape                 0.5.4                      py_1    conda-forge
dbus                      1.13.14              hb2f20db_0
decorator                 4.4.2                      py_0
defusedxml                0.6.0                      py_0
dill                      0.3.1.1                  pypi_0    pypi
diskcache                 4.1.0                    pypi_0    pypi
distributed               2.18.0           py37hc8dfbb8_0    conda-forge
dlpack                    0.2                  he1b5a44_1    conda-forge
double-conversion         3.1.5                he1b5a44_2    conda-forge
entrypoints               0.3                      py37_0
expat                     2.2.6                he6710b0_0
fastavro                  0.23.4           py37h8f50634_0    conda-forge
fastrlock                 0.5              py37h3340039_0    conda-forge
filelock                  3.0.12                     py_0
fiona                     1.8.13           py37h900e953_0    conda-forge
fontconfig                2.13.1            h86ecdb6_1001    conda-forge
freetype                  2.9.1                h8a8886c_1
freexl                    1.0.5             h14c3975_1002    conda-forge
fsspec                    0.7.4                      py_0    conda-forge
future                    0.18.2                   py37_1
gdal                      3.0.2            py37hbb6b9fb_2    conda-forge
geopandas                 0.7.0                      py_1    conda-forge
geos                      3.7.2                he1b5a44_2    conda-forge
geotiff                   1.5.1                h32362d2_6    conda-forge
gflags                    2.2.2             he1b5a44_1002    conda-forge
giflib                    5.1.7                h516909a_1    conda-forge
glib                      2.63.1               h5a9c865_0
glob2                     0.7                        py_0
glog                      0.4.0                h49b9bf7_3    conda-forge
gmp                       6.1.2                h6c8ec71_1
google-auth               1.17.2                   pypi_0    pypi
grpc-cpp                  1.23.0               h18db393_0    conda-forge
grpcio                    1.29.0                   pypi_0    pypi
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
hdf4                      4.2.13            hf30be14_1003    conda-forge
hdf5                      1.10.5          nompi_h3c11f04_1104    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
horovod                   0.19.4                   pypi_0    pypi
icu                       64.2                 he1b5a44_1    conda-forge
idna                      2.9                        py_1
imagecodecs-lite          2019.12.3        py37h03ebfcd_1    conda-forge
imageio                   2.8.0                      py_0    conda-forge
importlib-metadata        1.6.0                    py37_0
importlib_metadata        1.6.0                         0
intel-openmp              2020.1                      217    anaconda
ipykernel                 5.1.4            py37h39e3cac_0
ipython                   7.13.0           py37h5ca1d4c_0
ipython_genutils          0.2.0                    py37_0
ipywidgets                7.5.1                      py_0
jedi                      0.17.0                   py37_0
jinja2                    2.11.2                     py_0
joblib                    0.15.1                     py_0    conda-forge
jpeg                      9d                   h516909a_0    conda-forge
json-c                    0.13.1            hbfbb72e_1002    conda-forge
json5                     0.9.5                      py_0
jsonschema                3.2.0                    py37_0
jupyter-server-proxy      1.5.0                      py_0    conda-forge
jupyter_client            6.1.3                      py_0
jupyter_core              4.6.3                    py37_0
jupyterlab                1.2.6              pyhf63ae98_0
jupyterlab-nvdashboard    0.3.1                    pypi_0    pypi
jupyterlab_server         1.1.4                      py_0
kealib                    1.4.13               hec59c27_0    conda-forge
keras-preprocessing       1.1.2                    pypi_0    pypi
kiwisolver                1.2.0            py37h99015e2_0    conda-forge
krb5                      1.16.4               h2fd8d38_0    conda-forge
ld_impl_linux-64          2.33.1               h53a641e_7
libarchive                3.3.3             hb44662c_1005    conda-forge
libcudf                   0.15.0a200612   cuda10.1_g046385d16_1192    rapidsai-nightly
libcugraph                0.15.0a200616   cuda10.1_gaeee8aae_274    rapidsai-nightly
libcuml                   0.15.0a200616   cuda10.1_g24812e323_458    rapidsai-nightly
libcumlprims              0.15.0a200607       cuda10.1_39    rapidsai-nightly
libcurl                   7.68.0               hda55be3_0    conda-forge
libcuspatial              0.15.0a200616   cuda10.1_g68a198e_135    rapidsai-nightly
libdap4                   3.20.4               hd3bb157_0    conda-forge
libedit                   3.1.20181209         hc058e9b_0
libevent                  2.1.10               h72c5cf5_0    conda-forge
libffi                    3.2.1                hd88cf55_4
libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
libgdal                   3.0.2                hc7cfd23_2    conda-forge
libgfortran-ng            7.3.0                hdf63c60_0    anaconda
libgomp                   9.2.0                h24d8f2e_2    conda-forge
libhwloc                  2.1.0                h3c4fd83_0    conda-forge
libiconv                  1.15              h516909a_1006    conda-forge
libkml                    1.3.0             h4fcabce_1010    conda-forge
liblief                   0.10.1               he6710b0_0
libllvm8                  8.0.1                hc9558a2_0    conda-forge
libnetcdf                 4.7.1           nompi_h94020b1_102    conda-forge
libnvstrings              0.15.0a200604   cuda10.1_gaeda0c0_774    rapidsai-nightly
libpng                    1.6.37               hbc83047_0
libpq                     11.5                 hd9ab2ff_2    conda-forge
libprotobuf               3.8.0                h8b12597_0    conda-forge
librmm                    0.15.0a200616   cuda10.1_g9f65999_191    rapidsai-nightly
libsodium                 1.0.16               h1bed415_0
libspatialindex           1.9.3                he1b5a44_3    conda-forge
libspatialite             4.3.0a            h4f6d029_1032    conda-forge
libssh2                   1.9.0                hab1572f_2    conda-forge
libstdcxx-ng              9.1.0                hdf63c60_0
libtiff                   4.0.10            h57b8799_1003    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libuv                     1.34.0               h516909a_0    conda-forge
libwebp                   1.0.2                h302c0c8_3    conda-forge
libxcb                    1.13                 h1bed415_1
libxgboost                1.1.0dev.rapidsai0.14      cuda10.1_0    rapidsai-nightly
libxml2                   2.9.10               hee79883_0    conda-forge
llvmlite                  0.32.1           py37h5202443_0    conda-forge
locket                    0.2.0                      py_2    conda-forge
lz4-c                     1.8.3             he1b5a44_1001    conda-forge
lzo                       2.10                 h7b6447c_2
markdown                  3.2.2                      py_0    conda-forge
markupsafe                1.1.1            py37h7b6447c_0
matplotlib-base           3.2.1            py37h30547a4_0    conda-forge
mistune                   0.8.4            py37h7b6447c_0
mkl                       2019.4                      243    anaconda
mkl-service               2.3.0            py37he904b0f_0    anaconda
mkl_fft                   1.0.15           py37ha843d7b_0    anaconda
mkl_random                1.1.0            py37hd6b4f25_0    anaconda
msgpack-python            1.0.0            py37h99015e2_1    conda-forge
multidict                 4.7.6                    pypi_0    pypi
multipledispatch          0.6.0                      py_0    conda-forge
munch                     2.5.0                      py_0    conda-forge
navigator-updater         0.2.1                    py37_0
nbconvert                 5.6.1                    py37_0
nbformat                  5.0.6                      py_0
nccl                      2.5.7.1              h51cf6c1_0    conda-forge
ncurses                   6.2                  he6710b0_1
networkx                  2.4                        py_1    conda-forge
nodejs                    13.13.0              hf5d1a2b_0    conda-forge
notebook                  6.0.3                    py37_0
numba                     0.49.1           py37h0573a6f_0
numpy                     1.18.1           py37h4f9e942_0    anaconda
numpy-base                1.18.1           py37hde5b4d6_1    anaconda
nvstrings                 0.15.0a200604   py37_gaeda0c0_774    rapidsai-nightly
olefile                   0.46                     py37_0
openjpeg                  2.3.1                h21c5421_1    conda-forge
openssl                   1.1.1g               h7b6447c_0    anaconda
opt-einsum                3.2.1                    pypi_0    pypi
packaging                 20.4               pyh9f0ad1d_0    conda-forge
pandas                    0.25.3           py37hb3f55d8_0    conda-forge
pandoc                    2.2.3.2                       0
pandocfilters             1.4.2                    py37_1
panel                     0.6.4                         0    conda-forge
param                     1.9.3                      py_0    conda-forge
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.7.0                      py_0
partd                     1.1.0                      py_0    conda-forge
patchelf                  0.10                 he6710b0_0
pcre                      8.43                 he6710b0_0
petastorm                 0.9.2                    pypi_0    pypi
pexpect                   4.8.0                    py37_0
pickleshare               0.7.5                    py37_0
pillow                    6.2.1            py37h6b7be26_0    conda-forge
pip                       20.0.2                   py37_3
pixman                    0.38.0            h516909a_1003    conda-forge
pkginfo                   1.5.0.1                  py37_0
plotly                    4.8.1                      py_0
poppler                   0.67.0               h14e79db_8    conda-forge
poppler-data              0.4.9                         1    conda-forge
postgresql                11.5                 hc63931a_2    conda-forge
proj                      6.2.1                hc80f0dc_0    conda-forge
prometheus_client         0.7.1                      py_0
prompt-toolkit            3.0.5                      py_0
prompt_toolkit            3.0.5                         0
protobuf                  3.12.2                   pypi_0    pypi
psutil                    5.7.0            py37h7b6447c_0
ptyprocess                0.6.0                    py37_0
py-lief                   0.10.1           py37h403a769_0
py-xgboost                1.1.0dev.rapidsai0.14  cuda10.1py37_0    rapidsai-nightly
pyarrow                   0.17.1                   pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pycosat                   0.6.3            py37h7b6447c_0
pycparser                 2.20                       py_0
pyct                      0.4.6                      py_0    conda-forge
pyct-core                 0.4.6                      py_0    conda-forge
pyee                      7.0.2              pyh9f0ad1d_0    conda-forge
pygments                  2.6.1                      py_0
pynvml                    8.0.4                      py_0    conda-forge
pyopenssl                 19.1.0                   py37_0
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyppeteer                 0.0.25                     py_1    conda-forge
pyproj                    2.4.2.post1      py37h12732c1_0    conda-forge
pyqt                      5.9.2            py37h05f1152_2
pyrsistent                0.16.0           py37h7b6447c_0
pysocks                   1.7.1                    py37_0
python                    3.7.6           cpython_h8356626_6    conda-forge
python-dateutil           2.8.1                      py_0
python-libarchive-c       2.9                        py_0
python-snappy             0.5.4            py37he6710b0_0
python_abi                3.7                     1_cp37m    conda-forge
pytz                      2020.1                     py_0
pyviz_comms               0.7.5              pyh9f0ad1d_0    conda-forge
pywavelets                1.1.1            py37h03ebfcd_1    conda-forge
pyyaml                    5.3.1            py37h7b6447c_0
pyzmq                     18.1.1           py37he6710b0_0
qt                        5.9.7                h0c104cb_3    conda-forge
qtpy                      1.9.0                      py_0
rapids                    0.15.0          cuda10.1_py37_182    rapidsai-nightly
rapids-xgboost            0.15.0          cuda10.1_py37_182    rapidsai-nightly
re2                       2020.04.01           he1b5a44_0    conda-forge
readline                  8.0                  h7b6447c_0
requests                  2.23.0                   py37_0
retrying                  1.3.3                    py37_2
ripgrep                   11.0.2               he32d670_0
rmm                       0.15.0a200616   py37_g9f65999_191    rapidsai-nightly
rsa                       4.6                      pypi_0    pypi
rtree                     0.9.4            py37h8526d28_1    conda-forge
ruamel_yaml               0.15.87          py37h7b6447c_0
scikit-image              0.17.2           py37h0da4684_1    conda-forge
scikit-learn              0.22.1           py37hd81dba3_0
scipy                     1.4.1            py37h0b6359f_0
send2trash                1.5.0                    py37_0
setuptools                45.2.0                   py37_0
shapely                   1.6.4           py37hec07ddf_1006    conda-forge
simpervisor               0.3                        py_1    conda-forge
sip                       4.19.8           py37hf484d3e_0
six                       1.15.0                     py_0
snappy                    1.1.7                hbae5bb6_3
sortedcontainers          2.2.2              pyh9f0ad1d_0    conda-forge
soupsieve                 2.0.1                      py_0
spdlog                    1.6.1                hc9558a2_0    conda-forge
sqlite                    3.31.1               h62c20be_1
tbb                       2018.0.5             h2d50403_0    conda-forge
tblib                     1.6.0                      py_0    conda-forge
tensorboard               2.2.2                    pypi_0    pypi
tensorboard-plugin-wit    1.6.0.post3              pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
terminado                 0.8.3                    py37_0
testpath                  0.4.4                      py_0
thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
tifffile                  2020.6.3                   py_0    conda-forge
tiledb                    1.6.2                h69c774e_1    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
toolz                     0.10.0                     py_0    conda-forge
tornado                   6.0.4            py37h7b6447c_1
tqdm                      4.46.0                     py_0
traitlets                 4.3.3                    py37_0
tzcode                    2020a                h516909a_0    conda-forge
ucx                       1.8.0+g49982d4      cuda10.1_25    rapidsai-nightly
ucx-py                    0.15.0a200616+g49982d4         py37_69    rapidsai-nightly
uriparser                 0.9.3                he1b5a44_1    conda-forge
urllib3                   1.25.8                   py37_0
wcwidth                   0.1.9                      py_0
webencodings              0.5.1                    py37_1
websockets                8.1              py37h8f50634_1    conda-forge
wheel                     0.34.2                   py37_0
widgetsnbextension        3.5.1                    py37_0
xarray                    0.15.1                     py_0    conda-forge
xerces-c                  3.2.2             h8412b87_1004    conda-forge
xgboost                   1.1.0dev.rapidsai0.14  cuda10.1py37_0    rapidsai-nightly
xmltodict                 0.12.0                     py_0
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.9                h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.5                h7b6447c_0
yaml                      0.1.7                had09818_2
yarl                      1.4.2                    pypi_0    pypi
zeromq                    4.3.1                he6710b0_3
zict                      2.0.0                      py_0    conda-forge
zipp                      3.1.0                      py_0
zlib                      1.2.11               h7b6447c_3
zstd                      1.4.0                h3b9ef0a_0    conda-forge
(base) david@poc-gpu-host:~/antuit_demand_forecasting/antuit$ nvidia-smi
Tue Jun 16 18:27:09 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.126.02   Driver Version: 418.126.02   CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000001:00:00.0 Off |                  Off |
| N/A   35C    P0    26W / 250W |    154MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000002:00:00.0 Off |                  Off |
| N/A   32C    P0    25W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2661      G   /usr/lib/xorg/Xorg                            65MiB |
|    0      2787      G   /usr/bin/gnome-shell                          88MiB |
+-----------------------------------------------------------------------------+

JohnZed commented 4 years ago

This appears to be an issue with the compilation options for the cumlprims supporting library in 0.14, which is compiled only for newer GPU architectures, so it has no kernel image for your GPUs. We are working on a fix now. Thank you for reporting this!
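
For readers wondering how a build "for newer GPUs" leads to this error, here is a minimal sketch of the general CUDA compatibility rule: native code built for `sm_XY` runs only on devices with the same major version and an equal or higher minor version, while embedded PTX for `compute_XY` can be JIT-compiled for any device of capability X.Y or higher. The function name `has_kernel_image` and both architecture lists are hypothetical and only illustrate the rule; the actual targets libcumlprims was built for are not stated in this thread.

```python
from typing import Iterable, Tuple

Capability = Tuple[int, int]


def has_kernel_image(device: Capability,
                     sass_archs: Iterable[Capability],
                     ptx_archs: Iterable[Capability] = ()) -> bool:
    """Return True if a binary built for the given targets can run on `device`."""
    major, minor = device
    # Native code (SASS) for sm_XY runs on devices with the same major
    # version X and a minor version of at least Y.
    for sass_major, sass_minor in sass_archs:
        if sass_major == major and sass_minor <= minor:
            return True
    # Embedded PTX for compute_XY can be JIT-compiled by the driver for
    # any device whose capability is at least X.Y.
    for ptx in ptx_archs:
        if tuple(ptx) <= device:
            return True
    return False


P100 = (6, 0)  # Tesla P100 has compute capability 6.0

# Hypothetical build targets, chosen only to illustrate the rule.
newer_only = [(7, 0), (7, 5)]
with_pascal = [(6, 0), (7, 0), (7, 5)]

print(has_kernel_image(P100, newer_only))   # False
print(has_kernel_image(P100, with_pascal))  # True
```

In the first case no target matches the P100, which is the situation that produces "no kernel image is available for execution on the device".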

davidnoz123 commented 4 years ago

All good! Thanks for taking time to look into it!

dantegd commented 4 years ago

@davidnoz123 a new libcumlprims version (0.14.1) has been released for the stable 0.14 release, and it fixes the issue. For the nightly 0.15 version, fixed packages should be available within the next couple of days at the latest.

davidnoz123 commented 4 years ago

Wow! You guys do fast service! Thanks loads! ( :

pentschev commented 4 years ago

@davidnoz123 thanks for reporting the issue. I'll tentatively close this as it's probably resolved; feel free to reopen if you still experience problems after updating libcumlprims.
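
To confirm which libcumlprims an environment actually has after updating, one option is to read it out of the `conda list` text. The sketch below is only an illustration: `parse_conda_list` and `has_stable_fix` are hypothetical helper names, and the check assumes that a plain `X.Y.Z` version at or above 0.14.1 carries the fix mentioned above. For nightly versions such as `0.15.0a200607` it returns `None`, because the version string alone cannot tell whether the build predates the fix.

```python
import re
from typing import Dict, Optional

FIXED_STABLE = (0, 14, 1)  # libcumlprims release named in this thread


def parse_conda_list(text: str) -> Dict[str, str]:
    """Map package name -> version from `conda list` text output."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


def has_stable_fix(version: str) -> Optional[bool]:
    """True/False for plain X.Y.Z versions; None for anything else."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        return None  # e.g. a nightly like '0.15.0a200607'
    return tuple(int(g) for g in match.groups()) >= FIXED_STABLE


# Row copied from the package listing earlier in this thread.
listing = "libcumlprims              0.15.0a200607       cuda10.1_39    rapidsai-nightly"
installed = parse_conda_list(listing)["libcumlprims"]
print(installed, has_stable_fix(installed))
```

Feeding it the full output of `conda list` works the same way, since header lines starting with `#` are skipped.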

davidnoz123 commented 4 years ago

Tested and works with a new conda environment and:-

conda install -c rapidsai -c nvidia -c conda-forge -c defaults rapids=0.14 python=3.7 cudatoolkit=10.1

on:-

Linux poc-gpu-host 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

with:-

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.126.02   Driver Version: 418.126.02   CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000001:00:00.0 Off |                  Off |
| N/A   35C    P0    26W / 250W |    153MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000002:00:00.0 Off |                  Off |
| N/A   32C    P0    25W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2661      G   /usr/lib/xorg/Xorg                            65MiB |
|    0      2787      G   /usr/bin/gnome-shell                          88MiB |
+-----------------------------------------------------------------------------+

Great guys! Thanks again!