microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Build] CUDA Execution Provider library is needed even though we only use the TensorRT Execution Provider #22960

Open jcdatin opened 5 days ago

jcdatin commented 5 days ago

Describe the issue

There must be a way to build onnxruntime with TensorRT without the CUDA execution provider and its unused CUDA dependencies. libonnxruntime_providers_cuda.so is big (220 MB) and drags in other big dependencies, such as libcufft and libcublas, that we don't use at inference time (another 400 MB).
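As a quick way to see the size problem described above, the transitive CUDA dependencies of the provider library can be listed with `ldd`. This is only a sketch; the library path below is an assumption for a typical ONNX Runtime Linux build tree, not something stated in this issue.

```shell
# Sketch: list the CUDA libraries that the CUDA execution provider links against.
# Path is an assumed default ORT build output location; adjust for your build tree.
ldd build/Linux/Release/libonnxruntime_providers_cuda.so | grep -i -E 'cublas|cufft|cudnn'
```

Running this against the built `.so` shows whether libcublas/libcufft are runtime link dependencies, which is what forces them into the deployment image.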

Urgency

non blocking

Target platform

linux

Build script

build.py

Error / output

N/A

Visual Studio Version

N/A

GCC / Compiler Version

gcc11

skottmckay commented 5 days ago

https://github.com/microsoft/onnxruntime/blob/b930b4ab5bfd86eef3009ddfa52229e4b86c5e16/cmake/CMakeLists.txt#L90

Add this to your build command line --cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON
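For context, a full build invocation with that define might look as follows. This is a hedged sketch: the `--tensorrt_home`, `--cuda_home`, and `--cudnn_home` paths are assumptions for a typical Linux install and must be adapted to your machine; only the `--cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON` part comes from the comment above.

```shell
# Sketch of a TensorRT EP build with the minimal-CUDA CMake option.
# All *_home paths below are assumptions; point them at your local installs.
./build.sh --config Release --parallel \
  --use_tensorrt --tensorrt_home /usr/lib/x86_64-linux-gnu \
  --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu \
  --cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON
```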

jcdatin commented 4 days ago

Thank you! (I could not find it in build.py.) Trying this.

jcdatin commented 4 days ago

Unfortunately this does not work; there are still CUDA dependencies that cause compilation errors:

    [ 35%] Building CXX object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime/onnxruntime/contrib_ops/cuda/bert/decoder_attention.cc.o
    In file included from /onnxruntime/onnxruntime/contrib_ops/cuda/bert/attention.cc:5:
    /onnxruntime/onnxruntime/core/providers/cuda/shared_inc/fpgeneric.h:22:8: error: ‘cublasStatus_t’ does not name a type; did you mean ‘cublasStrsv’?
       22 | inline cublasStatus_t
          |        ^~~~~~

jcdatin commented 4 days ago

Same for Building CXX object CMakeFiles/onnxruntime_framework.dir/onnxruntime/onnxruntime/core/framework/config_options.cc (it refers to cublas), and many more...

skottmckay commented 4 hours ago

@gedoensmax is this expected? Not sure if there are other build settings required to use onnxruntime_CUDA_MINIMAL.