mars-project / mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
https://mars-project.readthedocs.io
Apache License 2.0
2.68k stars 325 forks

[BUG] throw a `No NVIDIA GPU detected` warning or a similar error when deploying Mars on Ray #3334

Closed dlee992 closed 10 months ago

dlee992 commented 1 year ago

As the title says: if we run GPU-related computations with Mars on a Ray cluster, using either the Mars-on-Ray or the Mars-on-Ray-DAG mode, the following warning is thrown when importing cuDF:

warning:
/home/admin/miniconda3/envs/rapids-23.02/lib/python3.8/site-packages/cudf/utils/gpu_utils.py:148: UserWarning: No NVIDIA GPU detected  
warnings.warn("No NVIDIA GPU detected")

Or an error is raised by the Numba CUDA module when importing cudf.DataFrame:

CudaSupportError: Error at driver init: [100] Call to cuInit results in CUDA_ERROR_NO_DEVICE
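
Both messages mean that the importing process cannot see any CUDA device. One thing worth checking (an assumption on my side, not something confirmed in this report) is the `CUDA_VISIBLE_DEVICES` environment variable inside the worker process, since an empty value hides every GPU from CUDA. Below is a minimal sketch using only the standard library; `describe_cuda_visibility` is a hypothetical helper name, not part of Mars, Ray or cuDF.

```python
import os


def describe_cuda_visibility(env=None):
    """Summarize what CUDA_VISIBLE_DEVICES allows this process to see.

    Hypothetical diagnostic helper. `env` defaults to os.environ;
    a plain dict can be passed in for testing.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        # Variable not set: CUDA is free to enumerate every GPU on the host.
        return "unset: no restriction from CUDA_VISIBLE_DEVICES"
    # Split the comma-separated list and drop blank entries.
    devices = [d.strip() for d in value.split(",") if d.strip()]
    if not devices:
        # Set but empty: CUDA sees no device at all.
        return "empty: all GPUs are hidden from this process"
    return "restricted to: " + ", ".join(devices)


if __name__ == "__main__":
    print(describe_cuda_visibility())
```

Running this inside the worker that performs `import cudf`, rather than on the driver, shows what that process actually sees. If it reports `empty`, the GPU is being hidden before cuDF or Numba ever initialize the driver.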