Open webfrank opened 1 week ago
Thanks, and I'm glad it's working, at least somewhat!
I would expect any of the initialization functions to return an error if CUDA was not actually initialized correctly... odd. Have you verified that the library runs correctly by running `go test -v -bench=.` from its source directory? You'd need to set the `ONNXRUNTIME_SHARED_LIBRARY_PATH` environment variable to point to your GPU-enabled copy of `onnxruntime.so` in order to run the tests, but this should give you good information on whether CUDA is enabled and working properly. (Specifically, check the `BenchmarkCUDASession` output and make sure it's faster than the `BenchmarkOpMultiThreaded` output.)
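The benchmark run described above might look like the following shell session (the repository URL is the bindings' official repo; the library path is only an illustrative placeholder):

```shell
# Clone the bindings and run the benchmarks from the source directory.
git clone https://github.com/yalue/onnxruntime_go.git
cd onnxruntime_go

# Point the tests at a GPU-enabled build of the shared library
# (example path; use wherever your copy actually lives).
export ONNXRUNTIME_SHARED_LIBRARY_PATH=/opt/onnxruntime-gpu/lib/libonnxruntime.so

# Run the tests and benchmarks; compare BenchmarkCUDASession
# against BenchmarkOpMultiThreaded in the output.
go test -v -bench=.
```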
Depending on the size of the YOLOv8 network, it's possible that it's just not large enough to see a significant benefit from CUDA, especially given CUDA's higher overhead. However, it is indeed puzzling that `nvidia-smi` isn't showing anything. I've seen the current version of `onnxruntime_go` interact correctly with CUDA on several different systems, so I wonder if you're somehow loading the wrong copy of the library? Let me know if the tests pass.
And sorry for the slow update, I haven't had much time to look at this project recently.
Hi, sorry for the late reply. I managed to get CUDA working by upgrading the bindings and the library to the latest versions. Inference time is about 10ms on an AWS Tesla T4, but `nvidia-smi` shows no GPU-bound processes. If I disable the CUDA provider on the same hardware, I get about 120ms inference time, so I suppose it is using the GPU, but there's no direct evidence.
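One generic way to double-check whether the GPU is actually being exercised (standard `nvidia-smi` invocations, not commands from this thread) is to poll it while inference is running:

```shell
# List the compute processes currently using the GPU, if any.
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

# Or sample device utilization about once per second while the Go
# program runs; non-zero "sm" utilization means CUDA kernels are executing.
nvidia-smi dmon -s u
```

Note that a short-lived process can finish between `nvidia-smi` samples, so the process list being empty is not conclusive on its own.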
Hi, great work, and a flawless integration with Go.
I was trying to move inference to the CUDA device. This is the code I used to initialize the runtime:
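(The original snippet was not captured in this thread. For reference, a typical CUDA-enabled initialization with the `yalue/onnxruntime_go` bindings looks roughly like this; the shared-library path and device ID are placeholders, and session/tensor setup is omitted.)

```go
package main

import (
	"log"

	ort "github.com/yalue/onnxruntime_go"
)

func main() {
	// Point the bindings at the GPU-enabled shared library before
	// initializing the environment (placeholder path).
	ort.SetSharedLibraryPath("/opt/onnxruntime-gpu/lib/libonnxruntime.so")
	if err := ort.InitializeEnvironment(); err != nil {
		log.Fatalf("initializing onnxruntime: %v", err)
	}
	defer ort.DestroyEnvironment()

	// Create session options and append the CUDA execution provider.
	options, err := ort.NewSessionOptions()
	if err != nil {
		log.Fatalf("creating session options: %v", err)
	}
	defer options.Destroy()

	cudaOptions, err := ort.NewCUDAProviderOptions()
	if err != nil {
		log.Fatalf("creating CUDA provider options: %v", err)
	}
	defer cudaOptions.Destroy()

	// Select GPU 0. Checking these errors matters: if appending the
	// CUDA provider fails and the error is ignored, the session will
	// quietly run on CPU instead.
	if err := cudaOptions.Update(map[string]string{"device_id": "0"}); err != nil {
		log.Fatalf("updating CUDA provider options: %v", err)
	}
	if err := options.AppendExecutionProviderCUDA(cudaOptions); err != nil {
		log.Fatalf("appending CUDA provider: %v", err)
	}

	// Sessions created with these options should run on the GPU
	// (model loading and tensor setup omitted here).
}
```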
The library is the latest runtime (1.19.2) from the official repo, the GPU variant.
Inference is working, but with timing similar to the CPU. The output from `nvidia-smi` is this: