deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

No matching CUDA flavor for linux found: cu118/sm_75 #2130

Closed (reuschling closed 1 year ago)

reuschling commented 2 years ago

On my (Arch) Linux machine, I want to use the pre-trained Universal Sentence Encoder from TensorFlow Hub. This works well on CPU.

On a different machine with a GPU, I tried to accelerate this but got this warning:

WARN: No matching CUDA flavor for linux found: cu118/sm_75, fallback to CPU. << i.djl.tensorflow.engine.javacpp.LibUtils

TensorFlow version used by DJL (output of TensorFlow.version()): 2.7.1

NVIDIA libraries installed on the machine: cudnn 8.5.0.96-1, cuda 11.8.0-1, cuda-tools 11.8.0-1
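
The flavor string in the warning lines up with the installed toolkit: `cu118` corresponds to CUDA 11.8, and `sm_75` follows NVIDIA's naming for compute capability 7.5. Here is a minimal sketch of how such a string can be put together. The class and method names are hypothetical and this is not DJL's own code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of DJL. */
final class CudaFlavorName {

    // Captures the major and minor number at the start of a version string.
    private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)");

    /** Turns a CUDA version such as "11.8.0-1" into a tag such as "cu118". */
    static String cudaTag(String cudaVersion) {
        Matcher m = VERSION.matcher(cudaVersion.trim());
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized CUDA version: " + cudaVersion);
        }
        return "cu" + m.group(1) + m.group(2);
    }

    /** Turns a compute capability such as 7.5 into a tag such as "sm_75". */
    static String archTag(int major, int minor) {
        return "sm_" + major + minor;
    }

    public static void main(String[] args) {
        // CUDA package version from the report; 7.5 is the capability implied by sm_75.
        System.out.println(cudaTag("11.8.0-1") + "/" + archTag(7, 5)); // prints cu118/sm_75
    }
}
```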

DJL dependencies from my pom.xml:

        <dependency>
            <groupId>ai.djl</groupId>
            <artifactId>api</artifactId>
            <version>0.19.0</version>
        </dependency>

        <dependency>
            <groupId>ai.djl.tensorflow</groupId>
            <artifactId>tensorflow-engine</artifactId>
            <version>0.19.0</version>
            <scope>runtime</scope>
        </dependency>

        <dependency>
            <groupId>org.tensorflow</groupId>
            <artifactId>tensorflow-core-platform</artifactId>
            <version>0.4.1</version>
        </dependency>

Is there a problem with the installed CUDA/cuDNN version? Or is the installation correct and detected, but DJL/TensorFlow can't find a matching GPU build of TensorFlow for 'cu118/sm_75' to download? How can I solve this? Thanks a lot.

reuschling commented 2 years ago

In the code of ai.djl.tensorflow.engine.javacpp.LibUtils I can see that DJL has no matching TensorFlow build under 'https://publish.djl.ai/tensorflow-*'. So I see two possibilities:

  1. Changing the OS CUDA version to one that DJL offers for download
  2. Pointing DJL to a self-installed TensorFlow version via the TENSORFLOW_LIBRARY_PATH environment variable

For solution 1: is there somewhere I can check which versions are available at https://publish.djl.ai?
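
For option 2, it may help to confirm that the variable is actually visible to the JVM before starting the application. The sketch below only checks that TENSORFLOW_LIBRARY_PATH is set and names an existing directory. That it should point to a directory is an assumption here, and the class name is hypothetical:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical pre-flight check; not part of DJL. */
final class LibraryPathCheck {

    /** Returns the path if the value is set and names an existing directory. */
    static Optional<Path> existingDirectory(String value) {
        if (value == null || value.isBlank()) {
            return Optional.empty();
        }
        Path path = Paths.get(value);
        return Files.isDirectory(path) ? Optional.of(path) : Optional.empty();
    }

    public static void main(String[] args) {
        // Variable name as given in the comment above.
        Optional<Path> dir = existingDirectory(System.getenv("TENSORFLOW_LIBRARY_PATH"));
        if (dir.isPresent()) {
            System.out.println("Override directory: " + dir.get());
        } else {
            System.out.println("TENSORFLOW_LIBRARY_PATH is unset or not a directory");
        }
    }
}
```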

frankfliu commented 2 years ago

@reuschling DJL TensorFlow 2.7.x is compiled against CUDA 11.3. Please switch to CUDA 11.3 to use TensorFlow on GPU.

DJL doesn't depend directly on the tensorflow-core-platform package, so you don't need to add it to your pom.xml. As you already saw in DJL's code, DJL auto-detects your system; if no matching CUDA flavor is found, it falls back to CPU. You can also bundle an OS-specific dependency: https://github.com/deepjavalibrary/djl/blob/master/engines/tensorflow/tensorflow-engine/README.md#linux-gpu
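
The detect-then-fall-back behaviour described here can be pictured roughly as follows. This is a simplified sketch, not DJL's LibUtils: the class name, the exact-match rule, and the list of published flavors are assumptions made for illustration.

```java
import java.util.List;

/** Hypothetical sketch of flavor selection with CPU fallback; not DJL code. */
final class FlavorSelector {

    static final String CPU = "cpu";

    /** Returns the detected CUDA flavor if a build exists for it, otherwise "cpu". */
    static String select(String detected, List<String> available) {
        // Check for null first: immutable lists reject contains(null).
        if (detected != null && available.contains(detected)) {
            return detected;
        }
        return CPU;
    }

    public static void main(String[] args) {
        // Assumed for illustration: only a CUDA 11.3 build is published.
        List<String> published = List.of("cu113");
        System.out.println(select("cu118", published)); // prints cpu
        System.out.println(select("cu113", published)); // prints cu113
    }
}
```

With CUDA 11.8 installed, the detected flavor has no counterpart in the assumed list, which matches the CPU fallback seen in the warning.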

frankfliu commented 2 years ago

@reuschling You can also find the Maven packages that DJL publishes here: https://search.maven.org/search?q=g:ai.djl.tensorflow

reuschling commented 1 year ago

Thanks a lot, I was able to get it working with the CUDA 11.0 version.