jcdatin opened 5 days ago
Add this to your build command line --cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON
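For reference, a full build invocation with that define might look like the sketch below. The paths and the other flags are illustrative assumptions for a Linux TensorRT build, not taken from this thread:

```shell
# Hedged sketch: TensorRT-enabled onnxruntime build with the minimal-CUDA define.
# --tensorrt_home / --cuda_home / --cudnn_home are placeholders for your installs.
./build.sh --config Release \
  --build_shared_lib --parallel \
  --use_tensorrt --tensorrt_home /usr/lib/x86_64-linux-gnu \
  --use_cuda --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu \
  --cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON
```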
Thank you! (I could not find it in build.py.) Trying this.
Unfortunately this does not work; there are still CUDA dependencies that cause compilation errors:
[ 35%] Building CXX object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime/onnxruntime/contrib_ops/cuda/bert/decoder_attention.cc.o
In file included from /onnxruntime/onnxruntime/contrib_ops/cuda/bert/attention.cc:5:
/onnxruntime/onnxruntime/core/providers/cuda/shared_inc/fpgeneric.h:22:8: error: ‘cublasStatus_t’ does not name a type; did you mean ‘cublasStrsv’?
22 | inline cublasStatus_t
| ^~~~~~
The same error occurs for Building CXX object CMakeFiles/onnxruntime_framework.dir/onnxruntime/onnxruntime/core/framework/config_options.cc (it also refers to cuBLAS), and for many more files.
@gedoensmax is this expected? Not sure if there are other build settings required to use onnxruntime_CUDA_MINIMAL
Describe the issue
There must be a way to build onnxruntime with TensorRT without the CUDA execution provider and its unused CUDA dependencies. libonnxruntime_providers_cuda.so is big (220 MB) and drags in other big dependencies such as libcufft and libcublas that we don't use at inference time (another 400 MB).
Urgency
non blocking
Target platform
Linux
Build script
build.py
Error / output
N/A
Visual Studio Version
N/A
GCC / Compiler Version
gcc11