NVIDIA / GPUStressTest

GPU Stress Test is a tool to stress the compute engine of NVIDIA Tesla GPUs by running a BLAS matrix multiply using different data types. It can be compiled and run on both Linux and Windows.

./gst: error while loading shared libraries: libcublasLt.so.12: cannot open shared object file: No such file or directory #7

Open gim4moon opened 9 months ago

gim4moon commented 9 months ago

Hello. I received the following error message while running the GPU stress test in a BCM10 environment.

CUDA Toolkit 12.3 is installed, and the error occurred when running the gst binary after building with cmake and make.

Are there any new environment variables that need to be set?

root@node001:~/GPUStressTest# ./gst
./gst: error while loading shared libraries: libcublasLt.so.12: cannot open shared object file: No such file or directory
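
For anyone hitting the same message, a quick way to check whether the dynamic loader can find the library is to try opening it by its soname. The snippet below is a minimal sketch, not part of GPUStressTest: `can_load` is a hypothetical helper name, and it only reports what the loader sees through `LD_LIBRARY_PATH` and the `ldconfig` cache at the time Python starts.

```python
import ctypes
import os


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can resolve and open lib_name."""
    try:
        ctypes.CDLL(lib_name)  # uses dlopen(), same search rules as ./gst
        return True
    except OSError:
        # Raised when the library (or one of its dependencies) is not found
        return False


if __name__ == "__main__":
    name = "libcublasLt.so.12"  # soname copied from the error message
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
    print(name, "loadable:", can_load(name))
```

If this prints `False`, the CUDA library directory is probably not on the loader's search path; where that directory lives depends on how the toolkit was installed, so no path is assumed here.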

gim4moon commented 9 months ago

After running ldconfig, the library error goes away, but running ./gst again produces the output below and the test fails.

root@node001:~/GPUStressTest# ./gst
./gst capturing GPU information...
WATCHDOG starting, TIMEOUT: 600 seconds
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA A100-PCIE-40GB"
Initilizing A100 40 GB based test suite
GPU Memory: 39, memgb: 40

Device 0: "NVIDIA A100-PCIE-40GB", PCIe: 3

***** STARTING TEST 0: INT8 On Device 0 NVIDIA A100-PCIE-40GB

math_type 10

args: matrixSizeA 26187688266 matrixSizeB 11047658931 matrixSizeC 2782772334

args: ta=N tb=T m=81218 n=34263 k=322437 lda=2598976 ldb=1096704 ldc=2598976 loop=10

std::exception: an illegal memory access was encountered
std::exception: an illegal memory access was encountered
testing cublasLt fail
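
As a reading aid for the log above, the three matrixSize values are consistent with plain element counts m*k, k*n and m*n for the printed m, n and k. The sketch below reproduces that arithmetic and compares each count with the largest signed 32-bit integer. It is an observation about the numbers only: `gemm_element_counts` is a hypothetical helper, and whether the size of these counts has anything to do with the illegal memory access is not established in this thread.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit index can hold


def gemm_element_counts(m: int, n: int, k: int) -> dict:
    """Element counts of A (m x k), B (k x n) and C (m x n) for a GEMM."""
    return {"A": m * k, "B": k * n, "C": m * n}


# m, n, k copied from the "args:" line of the log
counts = gemm_element_counts(m=81218, n=34263, k=322437)

for name, count in counts.items():
    print(f"matrixSize{name} = {count}, exceeds INT32_MAX: {count > INT32_MAX}")

print("total elements:", sum(counts.values()))
```

All three counts come out larger than INT32_MAX, which may be worth keeping in mind when reading the failure, but it is not a diagnosis.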