NVIDIA / warp

A Python framework for high performance GPU simulation and graphics
https://nvidia.github.io/warp/

Unable to determine CUDA driver version #219

Open fertiliz opened 2 weeks ago

fertiliz commented 2 weeks ago

    (foundationpose) robot@robot-System-Product-Name:~/CODE/FoundationPose$ python
    Python 3.9.19 (main, May 6 2024, 19:43:03)
    [GCC 11.2.0] :: Anaconda, Inc. on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import warp as wp
    >>> wp.init()
    Warp CUDA error: Failed to get driver entry point 'cuDriverGetVersion' (CUDA error 34)
    Warp CUDA warning: Unable to determine CUDA driver version
    Warp CUDA error: Failed to get driver entry point 'cuGetErrorString' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuGetErrorName' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuInit' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDeviceGet' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDeviceGetCount' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDeviceGetName' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDeviceGetAttribute' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDeviceGetUuid' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDevicePrimaryCtxRetain' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDevicePrimaryCtxRelease' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuDeviceCanAccessPeer' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuMemGetInfo' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxSetCurrent' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxGetCurrent' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxPushCurrent' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxPopCurrent' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxSynchronize' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxGetDevice' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxCreate' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxDestroy' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxEnablePeerAccess' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuCtxDisablePeerAccess' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamCreate' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamDestroy' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamSynchronize' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamWaitEvent' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamGetCtx' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamGetCaptureInfo' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuStreamUpdateCaptureDependencies' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuEventCreate' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuEventDestroy' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuEventRecord' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuEventRecordWithFlags' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuEventSynchronize' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuModuleLoadDataEx' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuModuleUnload' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuModuleGetFunction' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuLaunchKernel' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuMemcpyPeerAsync' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuPointerGetAttribute' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuGraphicsMapResources' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuGraphicsUnmapResources' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuGraphicsResourceGetMappedPointer' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuGraphicsGLRegisterBuffer' (CUDA error 34)
    Warp CUDA error: Failed to get driver entry point 'cuGraphicsUnregisterResource' (CUDA error 34)
    Warp 1.1.0 initialized:
       CUDA devices not available
       Devices:
         "cpu"      : "x86_64"
       Kernel cache: /home/robot/.cache/warp/1.1.0

My nvcc -V output is:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2022 NVIDIA Corporation
    Built on Wed_Sep_21_10:33:58_PDT_2022
    Cuda compilation tools, release 11.8, V11.8.89
    Build cuda_11.8.r11.8/compiler.31833905_0

zhihou7 commented 2 weeks ago

I got the same error inside Apptainer.

May I ask where Warp looks for the libraries (which path)?

My CUDA is cuda_11.3.r11.3/compiler.29920130_0 and my Warp is 1.0.0.

Looking forward to the reply,

Best,

c0d1f1ed commented 2 weeks ago

Thank you for reporting this. We just load libcuda.so dynamically using dlopen(), without specifying a directory.

The error you're getting is defined as:

    /**
     * This indicates that the CUDA driver that the application has loaded is a
     * stub library. Applications that run with the stub rather than a real
     * driver loaded will result in CUDA API returning this error.
     */
    CUDA_ERROR_STUB_LIBRARY                   = 34,

The stub library is essentially an empty version of libcuda.so that ships with the CUDA Toolkit. Applications can link against it, so the build system does not need to have the driver installed; only the system on which the software is deployed needs the real library under a path searched by dlopen().
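To check which of the two libraries a process actually picks up, one option is to repeat the steps described above from Python: load libcuda.so by bare name and call cuInit(0). The following is a minimal sketch, not Warp's own code. The helper names (`classify_cuinit_result`, `loaded_library_paths`, `probe_libcuda`) are made up for illustration, and reading /proc/self/maps assumes Linux.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_STUB_LIBRARY = 34  # value from the CUDA header excerpt quoted above


def classify_cuinit_result(code):
    """Turn a cuInit() return code into a short diagnosis string."""
    if code == CUDA_SUCCESS:
        return "real driver loaded"
    if code == CUDA_ERROR_STUB_LIBRARY:
        return "stub library loaded"
    return f"driver returned CUDA error {code}"


def loaded_library_paths(fragment="libcuda", maps_file="/proc/self/maps"):
    """List files mapped into this process whose path contains `fragment`."""
    paths = set()
    try:
        with open(maps_file) as maps:
            for line in maps:
                # Fields: address, perms, offset, dev, inode, [pathname]
                fields = line.split(None, 5)
                if len(fields) == 6 and fragment in fields[5]:
                    paths.add(fields[5].strip())
    except OSError:
        pass  # not Linux, or the file is unreadable
    return sorted(paths)


def probe_libcuda(name="libcuda.so"):
    """Load libcuda by bare name (no directory) and call cuInit(0)."""
    try:
        lib = ctypes.CDLL(name)
    except OSError as exc:
        return f"could not load {name}: {exc}", []
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    code = lib.cuInit(0)
    return classify_cuinit_result(code), loaded_library_paths()
```

Calling `probe_libcuda()` in the same environment where `wp.init()` fails should return a diagnosis plus the file path(s) the loader resolved. A path ending in `stubs/libcuda.so` would point to the situation described above.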

So it appears that your system has the directory of the stub (typically /usr/local/cuda/lib64/stubs) configured to be searched before the user-mode driver.
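If the stub directory is reaching the loader through LD_LIBRARY_PATH, a small script can flag the offending entries. This is only a sketch under that assumption: the function names are hypothetical, the "directory named stubs that contains libcuda.so" test is a heuristic, and it does not inspect other places the loader may consult, such as /etc/ld.so.conf or an RPATH.

```python
import os


def split_stub_dirs(search_path, lib_name="libcuda.so"):
    """Split a colon-separated search path into (stub_dirs, other_dirs).

    Heuristic: an entry is treated as a stub directory when its last
    path component is "stubs" and it contains a file named `lib_name`.
    """
    stub_dirs, other_dirs = [], []
    for entry in search_path.split(":"):
        if not entry:
            continue  # skip empty entries such as a trailing colon
        is_stub = (
            os.path.basename(os.path.normpath(entry)) == "stubs"
            and os.path.exists(os.path.join(entry, lib_name))
        )
        (stub_dirs if is_stub else other_dirs).append(entry)
    return stub_dirs, other_dirs


def path_without_stubs(search_path, lib_name="libcuda.so"):
    """Return the search path with stub directories removed, order preserved."""
    _, other_dirs = split_stub_dirs(search_path, lib_name)
    return ":".join(other_dirs)
```

For example, `split_stub_dirs(os.environ.get("LD_LIBRARY_PATH", ""))[0]` lists the suspicious entries, and `path_without_stubs(...)` gives a value that could be exported in the shell before starting Python. Changing `os.environ` from inside an already running interpreter is unlikely to help, since the dynamic loader normally reads LD_LIBRARY_PATH at process start.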

zhihou7 commented 2 weeks ago

Thanks for your reply. Do you have any suggestions for solving it? Should I reinstall CUDA, or change the library search path?

best

zhi hou



maohaotian commented 2 days ago

Same question here. How can this be fixed?