rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[QST] The error message is printed, but I can't solve this problem. Please help me. #5525

Open seo-jaeyong opened 1 year ago

seo-jaeyong commented 1 year ago

"I am working with cuML in Colab.

    # This gets the RAPIDS-Colab install files and checks your GPU. Run this and the next cell only.
    # Please read the output of this cell. If your Colab instance is not RAPIDS compatible, it will warn you and give you remediation steps.
    !git clone https://github.com/rapidsai/rapidsai-csp-utils.git
    !python rapidsai-csp-utils/colab/pip-install.py

    #!/usr/bin/env python

    import os, sys, io
    import subprocess
    from pathlib import Path

    # CFFI fix with pip
    output = subprocess.Popen(["pip uninstall --yes cffi"], shell=True, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
    for line in io.TextIOWrapper(output.stdout, encoding="utf-8"):
        if(line == ""):
            break
        else:
            print(line.rstrip())
    output = subprocess.Popen(["pip uninstall --yes cryptography"], shell=True, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
    for line in io.TextIOWrapper(output.stdout, encoding="utf-8"):
        if(line == ""):
            break
        else:
            print(line.rstrip())
    output = subprocess.Popen(["pip install cffi==1.15.0"], shell=True, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
    for line in io.TextIOWrapper(output.stdout, encoding="utf-8"):
        if(line == ""):
            break
        else:
            print(line.rstrip())

    # Install RAPIDS
    pkg = "rapids"
    if(sys.argv[1] == "nightly"):
        release = ["rapidsai-nightly", "23.06"]
        print("Installing RAPIDS Nightly "+release[1])
    else:
        release = ["rapidsai", "23.04"]
        print("Installing RAPIDS Stable "+release[1])

    pkg = "rapids"
    print("Starting the RAPIDS install on Colab. This will take about 15 minutes.")

    output = subprocess.Popen(["conda install -y --prefix /usr/local -c "+release[0]+" -c nvidia -c conda-forge python=3.10 cudatoolkit=11.8 "+pkg+"="+release[1]+" llvmlite gcsfs openssl dask-sql"], shell=True, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
    for line in io.TextIOWrapper(output.stdout, encoding="utf-8"):
        if(line == ""):
            break
        else:
            print(line.rstrip())

    print("RAPIDS conda installation complete. Updating Colab's libraries...")
    import sys, os, shutil
    sys.path.append('/usr/local/lib/python3.10/site-packages/')
    os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
    os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'

    os.environ["CONDA_PREFIX"] = "/usr/local"
    for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial', 'cupy', 'geos','geos_c']:
        fn = 'lib'+so+'.so'
        source_fn = '/usr/local/lib/'+fn
        dest_fn = '/usr/lib/'+fn
        if os.path.exists(source_fn):
            print(f'Copying {source_fn} to {dest_fn}')
            shutil.copyfile(source_fn, dest_fn)
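A side note on the setup script above: it repeats the same `Popen` read loop four times and never checks whether a command succeeded, so a failed `pip` or `conda` step could go unnoticed. A minimal sketch of a shared helper follows. The name `run_and_stream` is hypothetical and not part of rapidsai-csp-utils.

```python
import subprocess


def run_and_stream(command):
    """Run a shell command, echo its combined stdout/stderr line by line,
    and return the process exit code."""
    process = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as the script above does
        text=True,
    )
    for line in process.stdout:
        print(line.rstrip())
    return process.wait()


# Usage: a non-zero return value signals that the step failed.
exit_code = run_and_stream("echo hello from the helper")
print(f"exit code: {exit_code}")
```

Checking the returned code after each install step would at least show whether the environment was built as intended before any cuML code runs.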

**The code above was for environment setup; my own code follows.**

    from cuml.neighbors import KNeighborsClassifier as cuKNeighborsClassifier
    from cuml.svm import SVC as cuSVC
    from cuml.ensemble import RandomForestClassifier as cuRandomForestClassifier
    from cuml.metrics import accuracy_score, roc_auc_score
    import cupy as cp
    from tqdm import tqdm

    # Define the models
    print("Defining models...")
    knn = cuKNeighborsClassifier()
    svm = cuSVC(probability=True, cache_size = 200)
    rf = cuRandomForestClassifier()

    # Prepare a list of tasks
    tasks = [(knn, "KNN"), (svm, "SVM"), (rf, "Random Forest")]

    # Loop over the tasks
    for model, name in tqdm(tasks, desc="Training and predicting models"):
        # Fit the model on our data
        print(f"Training {name}...")
        model.fit(X_train, y_train.values)

        # Make predictions with the model
        print(f"Making predictions with {name}...")
        pred_proba = model.predict_proba(X_test)

        # Average the predictions (if there are multiple models)
        # If this is the first model, initialize avg_pred_proba
        if 'avg_pred_proba' not in locals():
            avg_pred_proba = pred_proba
        else:  # if it's not the first model, average with the existing avg_pred_proba
            avg_pred_proba = cp.mean(cp.array([avg_pred_proba, pred_proba]), axis=0)

    # Make final predictions by selecting the class with highest average probability
    print("Making final predictions...")
    y_pred = cp.argmax(avg_pred_proba, axis=1)

    # Move predictions to host
    print("Moving predictions to host memory...")
    y_pred = cp.asnumpy(y_pred)

    print("All done!")
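Separately from the crash, the averaging step in the loop above may not do what is intended. Each new model is averaged with the running result, so with three models the weights come out as 1/4, 1/4 and 1/2 rather than 1/3 each. In a notebook, the `locals()` check can also pick up a stale `avg_pred_proba` from an earlier run of the cell. The sketch below shows the difference, using NumPy as a stand-in for CuPy so it runs without a GPU (`stack` and `mean` exist in CuPy under the same names, as far as I know). The probability values are invented for illustration.

```python
import numpy as np


def average_probabilities(probability_arrays):
    """Return the equal-weight mean of equally shaped (n_samples, n_classes) arrays."""
    stacked = np.stack([np.asarray(p, dtype=np.float64) for p in probability_arrays])
    return stacked.mean(axis=0)


# Three hypothetical models scoring one sample over two classes.
knn_proba = np.array([[0.9, 0.1]])
svm_proba = np.array([[0.6, 0.4]])
rf_proba = np.array([[0.3, 0.7]])

# Pairwise running mean, as in the loop above: ((knn + svm) / 2 + rf) / 2
running = knn_proba
for proba in (svm_proba, rf_proba):
    running = np.mean(np.array([running, proba]), axis=0)

# Collect every model's output first, then average once.
equal_weight = average_probabilities([knn_proba, svm_proba, rf_proba])

print("running mean: ", running)
print("equal weights:", equal_weight)
```

Collecting the per-model outputs in a list that is created just before the loop avoids both the uneven weighting and the dependence on leftover notebook state.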

    Defining models...
    Training and predicting models: 0%| | 0/3 [00:00<?, ?it/s]Training KNN...
    Making predictions with KNN...
    Training and predicting models: 33%|███▎ | 1/3 [03:01<06:03, 181.58s/it]Training SVM...
    Training and predicting models: 33%|███▎ | 1/3 [48:02<1:36:05, 2882.54s/it]

    RuntimeError                              Traceback (most recent call last)
    in <cell line: 20>()
         21     # Fit the model on our data
         22     print(f"Training {name}...")
    ---> 23     model.fit(X_train, y_train.values)
         24
         25     # Make predictions with the model

    10 frames
    svc.pyx in cuml.svm.svc.SVC.fit()

    svc.pyx in cuml.svm.svc.SVC._fit_proba()

    svc.pyx in cuml.svm.svc.SVC._fit_proba()

    svc.pyx in cuml.svm.svc.SVC.decision_function()

    /usr/local/lib/python3.10/dist-packages/cuml/internals/api_decorators.py in wrapper(*args, **kwargs)
        186
        187                 if process_return:
    --> 188                     ret = func(*args, **kwargs)
        189                 else:
        190                     return func(*args, **kwargs)

    svm_base.pyx in cuml.svm.svm_base.SVMBase.predict()

    RuntimeError: cuBLAS error encountered at: file=/__w/cuml/cuml/python/_skbuild/linux-x86_64-3.10/cmake-build/_deps/raft-src/cpp/include/raft/linalg/detail/gemv.hpp line=47: call='detail::cublasgemv(cublas_h, trans_a ? CUBLAS_OP_T : CUBLAS_OP_N, m, n, alpha, A, lda, x, incx, beta, y, incy, stream)', Reason=13:CUBLAS_STATUS_EXECUTION_FAILED
    Obtained 64 stack frames

0 in /usr/local/lib/python3.10/dist-packages/cuml/internals/../libcuml++.so(_ZN4raft9exception18collect_call_stackEv+0x81) [0x7a023405fe91]

1 in /usr/local/lib/python3.10/dist-packages/cuml/internals/../libcuml++.so(_ZN4raft12cublas_errorC1ERKSs+0x14b) [0x7a023406166b]

2 in /usr/local/lib/python3.10/dist-packages/cuml/internals/../libcuml++.so(_ZN4raft6linalg6detail4gemvIfLb0EEEvRKNS_9resourcesEbiiPKT_S8_iS8_iS8_PS6_iP11CUstream_st+0x337) [0x7a02344acc97]

3 in /usr/local/lib/python3.10/dist-packages/cuml/internals/../libcuml++.so(_ZN2ML3SVM11svcPredictXIfNSt12experimental6mdspanIfNS2_7extentsIiJLm18446744073709551615ELm18446744073709551615EEEENS2_13layout_strideEN4raft20host_device_accessorINS2_16default_accessorIfEELNS7_11memory_typeE1EEEEEEEvRKNS7_8handle_tET0_iiRNS7_8distance7kernels12KernelParamsERKNS0_8SvmModelIT_EEPSN_SN_b+0x5b6) [0x7a0234bab506]

4 in /usr/local/lib/python3.10/dist-packages/cuml/internals/../libcuml++.so(_ZN2ML3SVM10svcPredictIfEEvRKN4raft8handle_tEPT_iiRNS2_8distance7kernels12KernelParamsERKNS0_8SvmModelIS6_EES7_S6_b+0x49) [0x7a0234bac089]

5 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svm_base.cpython-310-x86_64-linux-gnu.so(+0x338d3) [0x7a022e7538d3]

6 in /usr/bin/python3(PyObject_Call+0xbb) [0x565afc2dbb5b]

7 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a37) [0x565afc2b7d87]

8 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

9 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svc.cpython-310-x86_64-linux-gnu.so(+0x2cece) [0x7a022e8f9ece]

10 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svc.cpython-310-x86_64-linux-gnu.so(+0x32895) [0x7a022e8ff895]

11 in /usr/bin/python3(PyObject_Call+0xbb) [0x565afc2dbb5b]

12 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a37) [0x565afc2b7d87]

13 in /usr/bin/python3(+0x16af11) [0x565afc2daf11]

14 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x1a1b) [0x565afc2b6d6b]

15 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

16 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x6cd) [0x565afc2b5a1d]

17 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

18 in /usr/bin/python3(PyObject_Call+0x122) [0x565afc2dbbc2]

19 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a37) [0x565afc2b7d87]

20 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

21 in /usr/bin/python3(_PyObject_FastCallDictTstate+0x16d) [0x565afc2c295d]

22 in /usr/bin/python3(_PyObject_Call_Prepend+0x5c) [0x565afc2d7f9c]

23 in /usr/bin/python3(+0x285050) [0x565afc3f5050]

24 in /usr/bin/python3(PyObject_Call+0xbb) [0x565afc2dbb5b]

25 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a37) [0x565afc2b7d87]

26 in /usr/bin/python3(+0x17a312) [0x565afc2ea312]

27 in /usr/bin/python3(+0x1802e4) [0x565afc2f02e4]

28 in /usr/bin/python3(+0x1d52d6) [0x565afc3452d6]

29 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x6cd) [0x565afc2b5a1d]

30 in /usr/bin/python3(+0x16af11) [0x565afc2daf11]

31 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x640a) [0x565afc2bb75a]

32 in /usr/bin/python3(_PyObject_FastCallDictTstate+0xc4) [0x565afc2c28b4]

33 in /usr/bin/python3(_PyObject_Call_Prepend+0x5c) [0x565afc2d7f9c]

34 in /usr/bin/python3(+0x285050) [0x565afc3f5050]

35 in /usr/bin/python3(_PyObject_MakeTpCall+0x25b) [0x565afc2c372b]

36 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x67dc) [0x565afc2bbb2c]

37 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

38 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svc.cpython-310-x86_64-linux-gnu.so(+0x2cece) [0x7a022e8f9ece]

39 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svc.cpython-310-x86_64-linux-gnu.so(+0x44e68) [0x7a022e911e68]

40 in /usr/bin/python3(PyObject_Call+0xbb) [0x565afc2dbb5b]

41 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a37) [0x565afc2b7d87]

42 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

43 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svc.cpython-310-x86_64-linux-gnu.so(+0x2cece) [0x7a022e8f9ece]

44 in /usr/local/lib/python3.10/dist-packages/cuml/svm/svc.cpython-310-x86_64-linux-gnu.so(+0x39038) [0x7a022e906038]

45 in /usr/bin/python3(PyObject_Call+0xbb) [0x565afc2dbb5b]

46 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a37) [0x565afc2b7d87]

47 in /usr/bin/python3(+0x16af11) [0x565afc2daf11]

48 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x640a) [0x565afc2bb75a]

49 in /usr/bin/python3(+0x142176) [0x565afc2b2176]

50 in /usr/bin/python3(PyEval_EvalCode+0x86) [0x565afc3a7c56]

51 in /usr/bin/python3(+0x23d93d) [0x565afc3ad93d]

52 in /usr/bin/python3(+0x15d749) [0x565afc2cd749]

53 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x6cd) [0x565afc2b5a1d]

54 in /usr/bin/python3(+0x17a640) [0x565afc2ea640]

55 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x287f) [0x565afc2b7bcf]

56 in /usr/bin/python3(+0x17a640) [0x565afc2ea640]

57 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x287f) [0x565afc2b7bcf]

58 in /usr/bin/python3(+0x17a640) [0x565afc2ea640]

59 in /usr/bin/python3(+0x25985f) [0x565afc3c985f]

60 in /usr/bin/python3(+0x1689fa) [0x565afc2d89fa]

61 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x8c4) [0x565afc2b5c14]

62 in /usr/bin/python3(_PyFunction_Vectorcall+0x7c) [0x565afc2cd4ec]

63 in /usr/bin/python3(_PyEval_EvalFrameDefault+0x6cd) [0x565afc2b5a1d]
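A hedged debugging note on the trace above: CUDA work is launched asynchronously, so the call named in the error (`cublasgemv`) is where the failure was noticed, which is not necessarily where it occurred. Setting the CUDA environment variable `CUDA_LAUNCH_BLOCKING=1` before any GPU library initializes makes launches synchronous, which can make the reported call site more trustworthy. The sketch below assumes a fresh runtime. The helper name `enable_synchronous_cuda` and the module list are illustrative and not part of cuML.

```python
import os
import sys


def enable_synchronous_cuda(gpu_modules=("cupy", "cuml", "cudf")):
    """Set CUDA_LAUNCH_BLOCKING=1 and return the GPU modules that were
    already imported (for those, the setting may come too late)."""
    already_loaded = [name for name in gpu_modules if name in sys.modules]
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return already_loaded


# Run this in the first cell, before importing cuml or cupy.
loaded = enable_synchronous_cuda()
if loaded:
    print("Restart the runtime first; already imported:", loaded)
else:
    print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

With this in place, re-running only the SVM step rather than the full loop would show whether the error still surfaces in `gemv`.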

The code above includes what I wrote and the resulting error messages. I am trying to debug this, but I suspect the problem lies in the environment setup, which is why I am asking. Thank you."

dantegd commented 1 year ago

Hi @seo-jaeyong, I wanted to give a status update here: I have been trying to reproduce this but haven't been successful, so I have reached out to other folks who might be able to help triage issues with cuML in Colab.
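For anyone trying to reproduce this, a version report from the failing runtime may help. The traceback loads cuML from `/usr/local/lib/python3.10/dist-packages/`, while the conda script in the post appends `.../site-packages/` to `sys.path`. That may mean the cuML actually in use came from a pip install rather than from that script. A minimal sketch using only the standard library is below. The distribution names in `PACKAGES` are assumptions (pip wheels may carry a CUDA suffix), so adjust them as needed.

```python
from importlib import metadata
import platform
import sys


def collect_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed distribution names; not taken from the issue.
PACKAGES = ["cuml", "cuml-cu11", "cudf", "cudf-cu11",
            "cupy", "cupy-cuda11x", "numba", "numpy"]

print("python:", sys.version.split()[0], "| platform:", platform.platform())
for name, version in collect_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Posting this output together with the shapes and dtypes of `X_train` and `y_train` would give a much narrower target for reproduction.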