deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0
4.09k stars 650 forks

Support for Radeon GPUs? #3115

Closed (viggy96 closed 3 months ago)

viggy96 commented 5 months ago

Description

Support Radeon GPUs for accelerating inference and training.

Will this change the current api? How? No

Who will benefit from this enhancement? All users of Radeon GPUs

viggy96 commented 5 months ago

I've tried the PyTorch ROCm build from here: https://repo.radeon.com/rocm/manylinux/rocm-rel-6.0/README.html and it works according to these validation instructions: https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/install-pytorch.html#verify-pytorch-installation

However, I get the following error when running my project:

[pool-1-thread-1] WARN ai.djl.util.Platform - The bundled library: cu121-linux-x86_64:2.1.1-20231129 doesn't match system: cpu-linux-x86_64:2.1.1
[pool-1-thread-1] INFO ai.djl.util.Platform - Ignore mismatching platform from: jar:file:/home/vignesh/.gradle/caches/modules-2/files-2.1/ai.djl.pytorch/pytorch-native-cu121/2.1.1/fe8e6fa55e25294ae61c9832c029d5dddbd759aa/pytorch-native-cu121-2.1.1-linux-x86_64.jar!/native/lib/pytorch.properties
[pool-1-thread-1] INFO ai.djl.util.Platform - Found matching platform from: jar:file:/home/vignesh/.gradle/caches/modules-2/files-2.1/ai.djl.pytorch/pytorch-native-cpu/2.1.1/2625b85275629071b06b0f7f27822e03257dffa0/pytorch-native-cpu-2.1.1-linux-x86_64.jar!/native/lib/pytorch.properties
OpenJDK 64-Bit Server VM warning: You have loaded library /home/vignesh/.local/lib/python3.11/site-packages/torch/lib/libamdhip64.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
amdgpu.ids: No such file or directory
terminate called after throwing an instance of 'std::runtime_error'
  what():  Invalid ext op lib path
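
A note on the first warning above, as a rough sketch rather than DJL's actual matching logic: the two tags appear to differ only in their first field (`cu121` versus `cpu`), which presumably means no CUDA runtime was detected on this Radeon machine, so the CPU natives were picked. The hypothetical `PlatformTag` class below only splits the two strings copied from the log, assuming a `flavor-os-arch:version` layout inferred from those strings.

```java
/** Illustrative parser for the tags in the warning; not part of DJL. */
final class PlatformTag {
    final String flavor;
    final String os;
    final String arch;
    final String version;

    private PlatformTag(String flavor, String os, String arch, String version) {
        this.flavor = flavor;
        this.os = os;
        this.arch = arch;
        this.version = version;
    }

    /** Parses a tag, assuming the layout "flavor-os-arch:version[-buildDate]". */
    static PlatformTag parse(String tag) {
        String[] halves = tag.split(":", 2);
        if (halves.length != 2) {
            throw new IllegalArgumentException("Missing ':' in " + tag);
        }
        String[] parts = halves[0].split("-", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected flavor-os-arch in " + tag);
        }
        // Drop an optional build-date suffix such as "-20231129".
        String version = halves[1].split("-", 2)[0];
        return new PlatformTag(parts[0], parts[1], parts[2], version);
    }

    /** True when OS, architecture, and version agree, so only the flavor can differ. */
    boolean sameTargetAs(PlatformTag other) {
        return os.equals(other.os)
                && arch.equals(other.arch)
                && version.equals(other.version);
    }

    public static void main(String[] args) {
        // Both strings are copied verbatim from the warning in the log.
        PlatformTag bundled = parse("cu121-linux-x86_64:2.1.1-20231129");
        PlatformTag system = parse("cpu-linux-x86_64:2.1.1");
        System.out.println("same OS/arch/version: " + bundled.sameTargetAs(system));
        System.out.println("flavors: " + bundled.flavor + " vs " + system.flavor);
    }
}
```

If that reading is right, the `cu121` jar is skipped because of its flavor, and the crash further down comes from the ROCm libraries being loaded alongside the CPU natives.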

viggy96 commented 5 months ago

Has DJL ever used Radeon GPUs?

frankfliu commented 5 months ago

@viggy96

We don't support ROCm, but you can try building the PyTorch JNI library for ROCm yourself. See: https://github.com/deepjavalibrary/djl/blob/master/engines/pytorch/pytorch-native/build.sh#L26

frankfliu commented 5 months ago

@viggy96

You can actually use DJL with ROCm through the OnnxRuntime engine. See: https://github.com/deepjavalibrary/djl/blob/master/engines/onnxruntime/onnxruntime-engine/src/main/java/ai/djl/onnxruntime/engine/OrtModel.java#L212-L213
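
As a rough sketch of what the linked lines seem to do: the engine reads a model-loading option and enables the matching ONNX Runtime execution provider. The standalone class below is not DJL code. The option key `ortDevice`, the value `ROCM`, and the helper names are assumptions based on a reading of that file, so check them against the linked source. It also presumably needs an ONNX Runtime native build that includes the ROCm execution provider.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: mimics option-based provider selection; not DJL code. */
final class OrtDeviceSketch {

    /** Assumed option key read by the OnnxRuntime engine. */
    static final String ORT_DEVICE_KEY = "ortDevice";

    /** Builds model-loading options that request the ROCm execution provider. */
    static Map<String, String> rocmOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put(ORT_DEVICE_KEY, "ROCM"); // assumed value, see the linked file
        return options;
    }

    /** Hypothetical dispatch: returns the name of the provider that would be enabled. */
    static String selectProvider(Map<String, String> options, boolean cudaGpuDetected) {
        String requested = options.get(ORT_DEVICE_KEY);
        if (requested == null) {
            // No explicit request: fall back to CUDA when a CUDA GPU is seen, else CPU.
            return cudaGpuDetected ? "CUDA" : "CPU";
        }
        switch (requested) {
            case "ROCM":
            case "TensorRT":
            case "CoreML":
                return requested;
            default:
                throw new UnsupportedOperationException(requested + " not supported.");
        }
    }

    public static void main(String[] args) {
        // On a Radeon machine no CUDA GPU is detected, so the option must be explicit.
        System.out.println(selectProvider(rocmOptions(), false));
    }
}
```

In a real project the same key/value pair would most likely go through `Criteria.builder().optEngine("OnnxRuntime").optOption("ortDevice", "ROCM")`, again assuming the key and value match the linked source.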

viggy96 commented 5 months ago

That sounds great. Do you have any object detection inference examples using OnnxRuntime?

frankfliu commented 5 months ago

https://github.com/deepjavalibrary/djl/blob/master/examples/src/main/java/ai/djl/examples/inference/Yolov8Detection.java