DeltaGroupNJUPT / Vina-GPU-2.1

Vina-GPU 2.1, an improved docking toolkit for faster speed and higher accuracy on the virtual screening
Apache License 2.0

"Cannot find any NVIDIA platform" error #25

Open uniquelemon opened 5 months ago

uniquelemon commented 5 months ago

Hi, I am running AutoDock-Vina-GPU 2.1 on WSL2 with CUDA 12.4, and I have tried Boost versions 1.77, 1.83, and 1.85. However, the test run fails with the error 'Cannot find any NVIDIA platform'. My WSL2 environment runs AlphaFold2 normally, and both nvidia-smi and clinfo report the GPU correctly. The full output of all three commands follows.

$  ./AutoDock-Vina-GPU-2-1 --config ./input_file_example/2bm2_config.txt
#################################################################
# If you used AutoDockVina-GPU 2.1 in your work, please cite:   #
#                                                               #
# Ding, Ji, et al. Vina-GPU 2.0: Further Accelerating AutoDock  #
# Vina and Its Derivatives with Graphics Processing Units.      #
# Journal of Chemical Information and Modeling (2023).          #
#                                                               #
# DOI https://doi.org/10.1021/acs.jcim.2c01504                  #
#                                                               #
# Shidi, Tang, Chen Ruiqi, Lin Mengru, Lin Qingde,              #
# Zhu Yanxiang, Wu Jiansheng, Hu Haifeng, and Ling Ming.        #
# Accelerating AutoDock Vina with GPUs.                         #
# Molecules 27.9 (2022): 3041.                                  #
#                                                               #
# DOI https://doi.org/10.3390/molecules27093041                 #
#                                                               #
# And also the origin AutoDock Vina paper:                      #
# O. Trott, A. J. Olson,                                        #
# AutoDock Vina: improving the speed and accuracy of docking    #
# with a new scoring function, efficient optimization and       #
# multithreading, Journal of Computational Chemistry 31 (2010)  #
# 455-461                                                       #
#                                                               #
# DOI 10.1002/jcc.21334                                         #
#                                                               #
#################################################################

Using virtual sreening mode

Output will be in the directory ./test_out
Reading input ... done.
Setting up the scoring function ... done.
Using heuristic search_depth
Analyzing the binding site ... done.
Cannot find any NVIDIA platform

$ clinfo
Number of platforms                               1
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 3.0 PoCL 5.0 Linux, RelWithDebInfo, RELOC, SPIR, LLVM 14.0.0, SLEEF, CUDA, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_pocl_content_size
  Platform Extensions with Version                cl_khr_icd  0x400000 (1.0.0)
                                                  cl_pocl_content_size  0x400000 (1.0.0)
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             POCL
  Platform Host timer resolution                  0ns

  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     NVIDIA GeForce RTX 3090
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 3.0 PoCL HSTR: CUDA-sm_86
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  5.0
  Device OpenCL C Version                         OpenCL C 1.2 PoCL
  Device OpenCL C all versions                    OpenCL C  0x400000 (1.0.0)
                                                  OpenCL C  0x401000 (1.1.0)
                                                  OpenCL C  0x402000 (1.2.0)
                                                  OpenCL C  0xc00000 (3.0.0)
  Device OpenCL C features                        opencl_c_images  0xc00000 (3.0.0)
                                                  opencl_c_atomic_order_acq_rel  0xc00000 (3.0.0)
                                                  opencl_c_atomic_order_seq_cst  0xc00000 (3.0.0)
                                                  opencl_c_atomic_scope_device  0xc00000 (3.0.0)
                                                  opencl_c_program_scope_global_variables  0xc00000 (3.0.0)
                                                  opencl_c_generic_address_space  0xc00000 (3.0.0)
                                                  opencl_c_fp16  0xc00000 (3.0.0)
                                                  opencl_c_fp64  0xc00000 (3.0.0)
  Latest comfornace test passed                   (n/a)
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 0000:08:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               82
  Max clock frequency                             1860MHz
  Compute Capability (NV)                         8.6
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple (device)     32
  Preferred work group size multiple (kernel)     32
  Warp size (NV)                                  32
  Max sub-groups per work group                   32
  Preferred / native vector sizes
    char                                          1 / 1
    short                                         1 / 1
    int                                           1 / 1
    long                                          1 / 1
    half                                          0 / 0 (cl_khr_fp16)
    float                                         1 / 1
    double                                        1 / 1 (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              25769148416 (24GiB)
  Error Correction support                        No
  Max memory allocation                           6442287104 (6GiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Preferred alignment for atomics
    SVM                                           64 bytes
    Global                                        64 bytes
    Local                                         64 bytes
  Atomic memory capabilities                      relaxed, work-group scope
  Atomic fence capabilities                       relaxed, acquire/release, work-group scope
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        None
  Image support                                   No
  Pipe support                                    No
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max number of constant args                     8
  Max constant buffer size                        65536 (64KiB)
  Generic address space support                   Yes
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties (on host)
    Out-of-order execution                        No
    Profiling                                     Yes
  Device enqueue capabilities                     (n/a)
  Queue properties (on device)
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Non-uniform work-groups                       No
    Work-group collective functions               No
    Sub-group independent forward progress        Yes
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  1
  IL version                                      (n/a)
  ILs with version                                (n/a)
  SPIR versions                                   (n/a)
  printf() buffer size                            16777216 (16MiB)
  Built-in kernels                                pocl.mul.i32;pocl.add.i32;pocl.dnn.conv2d_int8_relu;pocl.sgemm.local.f32;pocl.sgemm.tensor.f16f16f32;pocl.sgemm_ab.tensor.f16f16f32;pocl.abs.f32;pocl.add.i8;org.khronos.openvx.scale_image.nn.u8;org.khronos.openvx.scale_image.bl.u8;org.khronos.openvx.tensor_convert_depth.wrap.u8.f32
  Built-in kernels with version                   pocl.mul.i32  0x402000 (1.2.0)
                                                  pocl.add.i32  0x402000 (1.2.0)
                                                  pocl.dnn.conv2d_int8_relu  0x402000 (1.2.0)
                                                  pocl.sgemm.local.f32  0x402000 (1.2.0)
                                                  pocl.sgemm.tensor.f16f16f32  0x402000 (1.2.0)
                                                  pocl.sgemm_ab.tensor.f16f16f32  0x402000 (1.2.0)
                                                  pocl.abs.f32  0x402000 (1.2.0)
                                                  pocl.add.i8  0x402000 (1.2.0)
                                                  org.khronos.openvx.scale_image.nn.u8  0x402000 (1.2.0)
                                                  org.khronos.openvx.scale_image.bl.u8  0x402000 (1.2.0)
                                                  org.khronos.openvx.tensor_convert_depth.wrap.u8.f32  0x402000 (1.2.0)
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_device_attribute_query cl_khr_spir cl_khr_fp16 cl_khr_fp64
  Device Extensions with Version                  cl_khr_byte_addressable_store  0x400000 (1.0.0)
                                                  cl_khr_global_int32_base_atomics  0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics  0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics  0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics  0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics  0x400000 (1.0.0)
                                                  cl_nv_device_attribute_query  0x400000 (1.0.0)
                                                  cl_khr_spir  0x801000 (2.1.0)
                                                  cl_khr_fp16  0x400000 (1.0.0)
                                                  cl_khr_fp64  0x400000 (1.0.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [POCL]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   NVIDIA GeForce RTX 3090
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   NVIDIA GeForce RTX 3090
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Portable Computing Language
    Device Name                                   NVIDIA GeForce RTX 3090

$  nvidia-smi
Sun Apr 28 20:36:38 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.76.01              Driver Version: 552.22         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:08:00.0  On |                  N/A |
| 51%   55C    P0            122W /  370W |    1665MiB /  24576MiB |     25%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
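
One detail in the clinfo output above may be relevant: the only OpenCL platform listed is named "Portable Computing Language" with vendor "The pocl project", and "NVIDIA" appears only at the device level. If the program picks its platform by looking for "NVIDIA" in the platform name or vendor (an assumption, not something confirmed in this report), the error would appear even though the GPU is visible. Below is a minimal Python sketch for checking this from clinfo's text output. The helper names `platform_fields` and `mentions_nvidia` and the trimmed sample are illustrative only.

```python
def platform_fields(clinfo_text):
    """Collect distinct 'Platform Name' / 'Platform Vendor' values from clinfo text output."""
    fields = {"Platform Name": [], "Platform Vendor": []}
    for line in clinfo_text.splitlines():
        stripped = line.strip()
        for key, values in fields.items():
            # Match only the exact property label followed by its value.
            if stripped.startswith(key + " "):
                value = stripped[len(key):].strip()
                if value and value not in values:
                    values.append(value)
    return fields


def mentions_nvidia(fields):
    """True if any platform name or vendor contains 'nvidia' (case-insensitive)."""
    return any("nvidia" in v.lower() for vals in fields.values() for v in vals)


# Trimmed excerpt of the clinfo output from this report (device lines are ignored).
SAMPLE = """\
Number of platforms                               1
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Name                                   Portable Computing Language
Number of devices                                 1
  Device Name                                     NVIDIA GeForce RTX 3090
  Device Vendor                                   NVIDIA Corporation
"""

fields = platform_fields(SAMPLE)
print(fields)
print("platform mentions NVIDIA:", mentions_nvidia(fields))
```

On a live system the same function could be given the text captured with `subprocess.run(["clinfo"], capture_output=True, text=True).stdout` instead of the hard-coded sample.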

I am quite puzzled. Do you have any suggestions? Thank you,

Lee