et-blanc closed this issue 2 years ago
Have you installed NVIDIA CUDA and cuDNN? If so, which version of CUDA? Can you also share which DJL Gradle/Maven dependencies you are using?
Yes, I have installed NVIDIA CUDA and cuDNN. My CUDA version is 11.4. My dependencies are below:
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>ai.djl</groupId>
            <artifactId>bom</artifactId>
            <version>0.15.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
<dependencies>
    <!-- https://mvnrepository.com/artifact/net.java.dev.jna/jna -->
    <dependency>
        <groupId>net.java.dev.jna</groupId>
        <artifactId>jna</artifactId>
        <version>5.9.0</version>
    </dependency>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>api</artifactId>
    </dependency>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>basicdataset</artifactId>
    </dependency>
    <dependency>
        <groupId>ai.djl</groupId>
        <artifactId>model-zoo</artifactId>
    </dependency>
    <dependency>
        <groupId>ai.djl.sentencepiece</groupId>
        <artifactId>sentencepiece</artifactId>
    </dependency>
    <dependency>
        <groupId>ai.djl.pytorch</groupId>
        <artifactId>pytorch-engine</artifactId>
    </dependency>
    <dependency>
        <groupId>ai.djl.pytorch</groupId>
        <artifactId>pytorch-model-zoo</artifactId>
    </dependency>
    <!-- https://mvnrepository.com/artifact/ai.djl.pytorch/pytorch-jni -->
    <dependency>
        <groupId>ai.djl.pytorch</groupId>
        <artifactId>pytorch-jni</artifactId>
        <version>1.10.0-0.15.0</version>
    </dependency>
    <dependency>
        <groupId>ai.djl.pytorch</groupId>
        <artifactId>pytorch-native-cu113</artifactId>
        <classifier>linux-x86_64</classifier>
        <version>1.10.0</version>
        <scope>runtime</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/ai.djl.huggingface/tokenizers -->
    <dependency>
        <groupId>ai.djl.huggingface</groupId>
        <artifactId>tokenizers</artifactId>
        <version>0.15.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/args4j/args4j -->
    <dependency>
        <groupId>args4j</groupId>
        <artifactId>args4j</artifactId>
        <version>2.33</version>
    </dependency>
</dependencies>
@et-blanc Can you try CUDA 11.3?
CudaUtils tries to load the libcudart.so file. It seems the file is not found in LD_LIBRARY_PATH, or your CUDA driver and CUDA runtime point to different folders.
Can you check which version the following command returns:
nvcc --version
One more thing you can try: install the Python version of PyTorch and see if it can pick up the GPU.
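The LD_LIBRARY_PATH check suggested above can be sketched in plain Java. This is only an illustration, not how DJL's CudaUtils resolves the library: the class name `FindCudart` and the method `findCudartDirs` are hypothetical, and the sketch simply lists which directories on a colon-separated path contain a file whose name starts with `libcudart.so`.

```java
import java.io.File;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of DJL): reports which LD_LIBRARY_PATH
// entries contain a CUDA runtime library file.
public class FindCudart {

    // Returns every directory in a colon-separated path list that holds
    // at least one file whose name starts with "libcudart.so".
    static List<Path> findCudartDirs(String pathList) {
        List<Path> hits = new ArrayList<>();
        if (pathList == null || pathList.isEmpty()) {
            return hits;
        }
        for (String entry : pathList.split(":")) {
            if (entry.isEmpty()) {
                continue; // skip empty entries such as a trailing ':'
            }
            File dir = new File(entry);
            // list() returns null when the entry is not a readable directory
            String[] names = dir.list((d, name) -> name.startsWith("libcudart.so"));
            if (names != null && names.length > 0) {
                hits.add(dir.toPath());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String ldPath = System.getenv("LD_LIBRARY_PATH");
        List<Path> dirs = findCudartDirs(ldPath);
        if (dirs.isEmpty()) {
            System.out.println("No libcudart.so found on LD_LIBRARY_PATH");
        } else {
            dirs.forEach(d -> System.out.println("libcudart.so found in: " + d));
        }
    }
}
```

If more than one directory is reported, the entries may belong to different CUDA installations, which is worth ruling out before looking further.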
The Python version of PyTorch can pick up the GPU, and the command nvcc --version returns:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
It seems that the error comes from my CUDA version being too recent. Is it possible to use DJL with CUDA 11.5 or 11.6? If not, will it be supported soon?
@et-blanc DJL PyTorch 1.10.0 should work with CUDA 11.*
Can you check a few things:
Run nvidia-smi -l
Check that libcudart.so exists and is set properly in LD_LIBRARY_PATH
Run:
cd djl
./gradlew debugEngine -Dai.djl.default_engine=PyTorch
The command nvidia-smi -l returns:
The file libcudart.so exists and is located at ./usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart.so.
The command echo $LD_LIBRARY_PATH returns:
/usr/local/cuda/lib64:
/usr/local/cuda-11.0/lib64:
Finally, the command ./gradlew debugEngine -Dai.djl.default_engine=PyTorch returns:
Your libcudart.so comes from cuda-11.0, but your nvidia-smi shows that your CUDA version is 11.5. Something is wrong in your system.
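The two version numbers being compared here can be pulled out of the tool output programmatically. The sketch below is plain Java with hypothetical names (`CudaVersionCheck`, `toolkitVersion`, `driverVersion`); it assumes that nvcc prints a `release X.Y` token, as in the output quoted earlier in this thread, and that the nvidia-smi header contains a `CUDA Version: X.Y` field. The nvidia-smi sample string is an assumption, since that output is not reproduced in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of DJL): extracts CUDA version numbers
// from captured nvcc and nvidia-smi output so they can be compared.
public class CudaVersionCheck {

    private static final Pattern NVCC = Pattern.compile("release (\\d+)\\.(\\d+)");
    private static final Pattern SMI = Pattern.compile("CUDA Version: (\\d+)\\.(\\d+)");

    // Returns "major.minor" for the first match, or null if nothing matches.
    private static String extract(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1) + "." + m.group(2) : null;
    }

    // Toolkit (runtime) version as reported by `nvcc --version`.
    static String toolkitVersion(String nvccOutput) {
        return extract(NVCC, nvccOutput);
    }

    // CUDA version shown in the nvidia-smi header.
    static String driverVersion(String smiOutput) {
        return extract(SMI, smiOutput);
    }

    public static void main(String[] args) {
        // Line taken from the nvcc output quoted in this thread.
        String nvcc = "Cuda compilation tools, release 11.0, V11.0.221";
        // Assumed header fragment; the real nvidia-smi output is not shown here.
        String smi = "CUDA Version: 11.5";

        String toolkit = toolkitVersion(nvcc);
        String driver = driverVersion(smi);
        System.out.println("toolkit=" + toolkit + ", driver=" + driver);
        if (toolkit != null && !toolkit.equals(driver)) {
            System.out.println("Versions differ; check which CUDA install is on LD_LIBRARY_PATH");
        }
    }
}
```

The sketch only reports the two numbers. Whether a difference actually matters depends on the setup, so treat a mismatch as a prompt to inspect the installation rather than as proof of a fault.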
The command should be ./gradlew debugEnv. Sorry for giving you the wrong command.
Also, your DJL checkout seems to be an old version; we have already upgraded Gradle to 7.2. Please get the latest code and try the command.
Feel free to reopen this issue if you are still facing the problem.
Description
Hi,
I'm trying to run inference on my GPU (NVIDIA GeForce RTX 3090) using DJL 0.15.0. However, I can't load any model on my GPU.
I'm new to working with Java and DJL, so any help is very much appreciated.
Thank you.
Error Message
How to Reproduce?
Steps to reproduce
What have you tried to solve it?
It seems that DJL doesn't detect my GPU: