ika-rwth-aachen / docker-ros-ml-images

Machine Learning-Enabled ROS Docker Images
MIT License

Support for ONNX Runtime? #11

Open · BaderTim opened this issue 1 week ago

BaderTim commented 1 week ago

Hi there, thanks for the amazing collection!
Are there any plans to include support for ONNX Runtime with either the CUDA or TensorRT execution providers?

lreiher commented 5 days ago

Good idea to add ONNX RT. :)

I have created a quick PR https://github.com/ika-rwth-aachen/docker-ros-ml-images/pull/12 that additionally installs ONNX RT into the -ml images. Before merging, we still need to test this a little. If you already have an ONNX use case, would you mind testing the installation by using rwthika/ros2-ml:humble (or similar), running pip3 install onnxruntime-gpu==1.20.1, and then trying your use case?
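A minimal sanity check along those lines might look as follows (a sketch, not part of the PR; the image tag and ONNX RT version are the ones named above, and the docker run flags are assumptions):

```python
# Assumed setup, run on the host first (flags are illustrative):
#   docker run --rm -it --gpus all rwthika/ros2-ml:humble
#   pip3 install onnxruntime-gpu==1.20.1
# Then, inside the container, check which execution providers ONNX RT sees:
import onnxruntime as ort

print(ort.__version__)
print(ort.get_available_providers())
# onnxruntime-gpu builds typically list TensorrtExecutionProvider and
# CUDAExecutionProvider here; note that this only reflects what the build
# supports -- whether the providers actually initialize (i.e. find their
# TensorRT/cuDNN libraries) is only tested at session creation.
```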

We are planning a new release before the end of the year anyway and would then integrate this PR into it.

BaderTim commented 5 days ago

@lreiher thank you for your quick response. Following your instructions with my use case, I receive the following messages when creating an inference session:

2024-11-25 11:18:05.499424728 [E:onnxruntime:Default, provider_bridge_ort.cc:1848 TryGetProviderInfo_TensorRT] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1539 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: libnvinfer.so.10: cannot open shared object file: No such file or directory

*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/python/onnxruntime_pybind_state.cc:507 void onnxruntime::python::RegisterTensorRTPluginsAsCustomOps(PySessionOptions&, const onnxruntime::ProviderOptions&) Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported.
 when using [('TensorrtExecutionProvider', {'trt_engine_cache_enable': False, 'trt_fp16_enable': True, 'device_id': 0, 'trt_layer_norm_fp32_fallback': True, 'trt_dla_enable': False, 'trt_detailed_build_log': False, 'trt_builder_optimization_level': 1}), ('CUDAExecutionProvider', {'cudnn_conv_use_max_workspace': '0', 'device_id': 0})]
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
2024-11-25 11:20:49.166983822 [E:onnxruntime:Default, provider_bridge_ort.cc:1862 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1539 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn.so.9: cannot open shared object file: No such file or directory  

2024-11-25 11:20:49.167019460 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:993 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
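Both failures point to shared libraries that cannot be found at load time (libnvinfer.so.10 for the TensorRT provider, libcudnn.so.9 for the CUDA provider). One quick way to probe this from inside the container (a diagnostic sketch; the library names are taken from the log above):

```python
# Try to dlopen the libraries that the execution providers complained about.
import ctypes

for lib in ("libnvinfer.so.10", "libcudnn.so.9"):
    try:
        ctypes.CDLL(lib)
        print(f"{lib}: found")
    except OSError as err:
        print(f"{lib}: NOT found ({err})")
```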

lreiher commented 5 days ago

Okay, I guess we'll have to come up with a quick sample script ourselves to test ONNX RT (or perhaps you could even share yours?) and try to fix this issue.
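A quick sample script along those lines could look like this (a sketch; it builds a trivial ONNX model in memory so no model file is needed, and the provider order mirrors the fallback order from the log above):

```python
# Minimal ONNX RT smoke test: build a tiny Add graph and run it, preferring
# TensorRT, then CUDA, then CPU. Requires: pip3 install onnx onnxruntime-gpu
import numpy as np
import onnx
from onnx import TensorProto, helper
import onnxruntime as ort

# Trivial graph: Y = X + X
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 4])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 4])
graph = helper.make_graph([helper.make_node("Add", ["X", "X"], ["Y"])],
                          "smoke_test", [X], [Y])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.checker.check_model(model)

providers = ["TensorrtExecutionProvider", "CUDAExecutionProvider",
             "CPUExecutionProvider"]
sess = ort.InferenceSession(model.SerializeToString(), providers=providers)
print("Active providers:", sess.get_providers())  # reveals silent fallbacks

x = np.ones((1, 4), dtype=np.float32)
print("Output:", sess.run(None, {"X": x})[0])
```

Printing sess.get_providers() after session creation is the important part: ONNX RT falls back silently (as in the log above), so this shows which provider actually ended up being used.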

lreiher commented 5 days ago

See https://github.com/ika-rwth-aachen/docker-ros-ml-images/pull/12#discussion_r1857245840; further discussion should take place there.