Open frenetj opened 3 months ago
Could you try building ORT from this branch and see whether it stops the crash?
Hi Yifan,
Thanks for the quick fix; it works perfectly!
However, while compiling your branch with TensorRT 8.5.3, we got the following errors:
```
/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc: In member function 'onnxruntime::common::Status onnxruntime::TensorrtExecutionProvider::CreateNodeComputeInfoFromGraph(const onnxruntime::GraphViewer&, const onnxruntime::Node&, std::unordered_map<std::__cxx11::basic_string<char>, long unsigned int>&, std::unordered_map<std::__cxx11::basic_string<char>, long unsigned int>&, std::vector<onnxruntime::NodeComputeInfo>&)':
/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:3055:17: error: 'class nvinfer1::IBuilderConfig' has no member named 'setHardwareCompatibilityLevel'
 3055 |     trt_config->setHardwareCompatibilityLevel(nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:3055:57: error: 'nvinfer1::HardwareCompatibilityLevel' has not been declared
 3055 |     trt_config->setHardwareCompatibilityLevel(nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS);
      |                                               ^~~~~~~~~~~~~~~~~~~~~~~~~~
/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc: In lambda function:
/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:3644:21: error: 'class nvinfer1::IBuilderConfig' has no member named 'setHardwareCompatibilityLevel'
 3644 |     trt_config->setHardwareCompatibilityLevel(nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:3644:61: error: 'nvinfer1::HardwareCompatibilityLevel' has not been declared
 3644 |     trt_config->setHardwareCompatibilityLevel(nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS);
      |                                               ^~~~~~~~~~~~~~~~~~~~~~~~~~
gmake[2]: *** [CMakeFiles/onnxruntime_providers_tensorrt.dir/build.make:146: CMakeFiles/onnxruntime_providers_tensorrt.dir/git/onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2267: CMakeFiles/onnxruntime_providers_tensorrt.dir/all] Error 2
```
We fixed them by wrapping the calls to trt_config->setHardwareCompatibilityLevel in #if NV_TENSORRT_MAJOR >= 10 guards:
```diff
diff --git a/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc b/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
index 2df4611743..b1e7147ea1 100644
--- a/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
+++ b/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
@@ -3051,12 +3051,13 @@ Status TensorrtExecutionProvider::CreateNodeComputeInfoFromGraph(const GraphView
+#endif
   // Name the engine cache based on GPU compute capacity and reduce the chance of loading an incompatible cache
   // Note: Engine cache generated on a GPU with large memory might not be loadable on a GPU with smaller memory, even if they share the same compute capacity
   const std::string cache_path_prefix = cache_path + cache_hw_compat;
@@ -3639,12 +3640,13 @@ Status TensorrtExecutionProvider::CreateNodeComputeInfoFromGraph(const GraphView
     }
   }
+#endif
   // Build engine
   std::unique_ptr
```
Would it be possible for you to also make this change?
Note that GitHub's formatting is not rendering the second part of the diff above properly; please read it as plain text. A minimal sketch of the guard we added follows.
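This is a self-contained sketch, not the exact patch: the wrapper function is purely illustrative, and the real change simply guards the existing calls at lines 3055 and 3644 in place.

```cpp
#include <NvInfer.h>

// Illustrative sketch only. NV_TENSORRT_MAJOR comes from NvInferVersion.h,
// which NvInfer.h pulls in; trt_config is the builder config the EP already owns.
void SetHardwareCompatIfSupported(nvinfer1::IBuilderConfig* trt_config) {
#if NV_TENSORRT_MAJOR >= 10
  // HardwareCompatibilityLevel does not exist in TensorRT 8.5.x, so the call
  // is only compiled when the installed TensorRT headers provide it.
  trt_config->setHardwareCompatibilityLevel(nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS);
#else
  (void)trt_config;  // no-op when building against TensorRT 8.5.x
#endif
}
```

With this guard the file compiles against TensorRT 8.5.3, while builds against newer TensorRT keep the kAMPERE_PLUS hardware-compatibility behavior.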
Hi @frenetj, ORT has supported TRT 8.6 since 1.15 and has added features that are incompatible with older TRT 8.x releases. Please see the TRT version requirements at https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html. We recommend using the latest TRT 10.x, as ORT will gradually stop supporting TRT 8.6 in the future.
Hello @yf711, using TRT 8.6 with this fix works perfectly. Thanks a lot!
Hello @yf711, the fix doesn't seem to have been integrated into the latest release (1.19.2).
Hi @frenetj, thanks for the notice. I just found that my fix didn't make it into 1.19, but it will be included in the upcoming 1.20 release, which is targeted for early next month. You can also build from the rel-1.20.0 branch and see if that works as expected in your case.
Describe the issue
When the TensorRT EP fails to create an engine from the network and the client calls Run() again in the same session, the following crash occurs:
```
#0  0x00007efc5442df84 in nvinfer1::ICudaEngine::getNbIOTensors() const (this=0x0) at tensort/include/NvInferRuntime.h:2160
#1  0x00007efc54451cf8 in onnxruntime::TensorrtExecutionProvider::<lambda(onnxruntime::FunctionState, const OrtApi, OrtKernelContext)>::operator()(onnxruntime::FunctionState, const OrtApi , OrtKernelContext ) const (__closure=0x7efbfb1d8098, state=0x7efbfc81bf80, api=
#2  0x00007efc54487e8c in std::_Function_handler<onnxruntime::common::Status(void, const OrtApi, OrtKernelContext), onnxruntime::TensorrtExecutionProvider::CreateNodeComputeInfoFromGraph(const onnxruntime::GraphViewer&, const onnxruntime::Node&, std::unordered_map<std::cxx11::basic_string, long unsigned int>&, std::unordered_map<std:: cxx11::basic_string, long unsigned int>&, std::vector&)::<lambda(onnxruntime::FunctionState, const OrtApi , OrtKernelContext)> >::_M_invoke(const std::_Any_data &, void &&, const OrtApi &&, OrtKernelContext &&)
#3  0x00007f02b59addac in std::function<onnxruntime::common::Status (void, OrtApi const, OrtKernelContext)>::operator()(void, OrtApi const, OrtKernelContext) const (this=0x7efbfb1d8098, args#0=0x7efbfc81bf80, args#1=0x7f02b6d0b2e0, __args#2=0x7fff94d9ce50)
#4  0x00007f02b59a76b9 in onnxruntime::FunctionKernel::Compute(onnxruntime::OpKernelContext*) const (this=0x7efc014e2c00, context=0x7fff94d9ce50) at onnxruntime-1.18.0/onnxruntime/core/framework/func_kernel.h:52
#5  0x00007f02b5ac7d5c in onnxruntime::ExecuteKernel(onnxruntime::StreamExecutionContext&, unsigned long, unsigned long, bool const&, onnxruntime::SessionScope&) (ctx=..., idx=4937, stream_idx=0, terminate_flag=@0x2716f308: false, session_scope=...)
#6  0x00007f02b5abef4c in onnxruntime::LaunchKernelStep::Execute(onnxruntime::StreamExecutionContext&, unsigned long, onnxruntime::SessionScope&, bool const&, bool&) (this=0x3587a8e0, ctx=..., stream_idx=0, session_scope=..., terminate_flag=@0x2716f308: false, continue_flag=@0x7fff94d9d51f: true)
#7  0x00007f02b5acb5a3 in onnxruntime::RunSince(unsigned long, onnxruntime::StreamExecutionContext&, onnxruntime::SessionScope&, bool const&, unsigned long) (stream_idx=0, ctx=..., session_scope=..., terminate_flag=@0x2716f308: false, since=0)
#8  0x00007f02b5ac827b in onnxruntime::<lambda()>::operator()(void) const (__closure=0x7efc017dc3b0) at onnxruntime-1.18.0/onnxruntime/core/framework/sequential_executor.cc:589
#9  0x00007f02b5ac992f in std::_Function_handler<void(), onnxruntime::ExecuteThePlan(const onnxruntime::SessionState&, gsl::span, gsl::span, gsl::span, std::vector&, const std::unordered_map<long unsigned int, std::function<onnxruntime::common::Status(const onnxruntime::TensorShape&, const OrtDevice&, OrtValue&, bool&)> >&, const onnxruntime::logging::Logger&, const onnxruntime::DeviceStreamCollection*, bool const&, bool, bool)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297
#10 0x00007f02b4e39dac in std::function<void ()>::operator()() const (this=0x7fff94d9dbf0) at /usr/include/c++/8/bits/std_function.h:687
#11 0x00007f02b4e1ad49 in onnxruntime::concurrency::ThreadPool::Schedule(onnxruntime::concurrency::ThreadPool*, std::function<void ()>) (tp=0x0, fn=...) at onnxruntime-1.18.0/include/onnxruntime/core/platform/threadpool.h:233
#12 0x00007f02b5ac8608 in onnxruntime::ExecuteThePlan(onnxruntime::SessionState const&, gsl::span<int const, 18446744073709551615ul>, gsl::span<OrtValue const, 18446744073709551615ul>, gsl::span<int const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::hash, std::equal_to, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)> > > > const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollection const*, bool const&, bool, bool)
#13 0x00007f02b5a68157 in onnxruntime::utils::ExecuteGraphImpl(const onnxruntime::SessionState &, const onnxruntime::FeedsFetchesManager &, gsl::span<OrtValue const, 18446744073709551615>, std::vector<OrtValue, std::allocator > &, const std::unordered_map<long unsigned int, std::function<onnxruntime::common::Status(const onnxruntime::TensorShape&, const OrtDevice&, OrtValue&, bool&)>, std::hash, std::equal_to, std::allocator<std::pair<long unsigned int const, std::function<onnxruntime::common::Status(const onnxruntime::TensorShape&, const OrtDevice&, OrtValue&, bool&)> > > > &, ExecutionMode, const bool &, const onnxruntime::logging::Logger &, onnxruntime::DeviceStreamCollection , bool, onnxruntime::Stream )
#14 0x00007f02b5a6878e in onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator >&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollectionHolder&, bool, onnxruntime::Stream*)
#15 0x00007f02b5a68868 in onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator >&, ExecutionMode, OrtRunOptions const&, onnxruntime::DeviceStreamCollectionHolder&, onnxruntime::logging::Logger const&) (session_state=..., feeds_fetches_manager=..., feeds=..., fetches=std::vector of length 2, capacity 2 = {...}, execution_mode=ORT_SEQUENTIAL, run_options=..., device_stream_collection_holder=..., logger=...)
#16 0x00007f02b4e33fd5 in onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<std::cxx11::basic_string<char, std::char_traits, std::allocator > const, 18446744073709551615ul>, gsl::span<OrtValue const, 18446744073709551615ul>, gsl::span<std:: cxx11::basic_string<char, std::char_traits, std::allocator > const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator >, std::vector<OrtDevice, std::allocator > const )
#17 0x00007f02b4e351bc in onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<char const const, 18446744073709551615ul>, gsl::span<OrtValue const const, 18446744073709551615ul>, gsl::span<char const const, 18446744073709551615ul>, gsl::span<OrtValue, 18446744073709551615ul>)
#18 0x00007f02b4d42116 in OrtApis::Run(OrtSession, OrtRunOptions const, char const const, OrtValue const const, unsigned long, char const const, unsigned long, OrtValue**)
```
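Frame #0 shows getNbIOTensors() being invoked with this=0x0, i.e. the engine pointer held by the EP's compute function is still null after the failed build. The sketch below only illustrates that failure mode and the kind of null check that would avoid it; the struct and function names are ours, not the EP's actual members.

```cpp
#include <NvInferRuntime.h>

// Illustration only: mimics a per-node compute state in which the engine
// pointer is left null after "TensorRT EP failed to create engine from network".
struct FakeComputeState {
  nvinfer1::ICudaEngine* engine = nullptr;  // never assigned when the build failed
};

// Without the null check, a second Run() reaches getNbIOTensors() on a null
// engine, which is exactly frame #0 of the backtrace above.
int CountIOTensorsOrReportError(const FakeComputeState& state) {
  if (state.engine == nullptr) {
    return -1;  // in the EP this would be surfaced as an EP_FAIL Status instead of crashing
  }
  return state.engine->getNbIOTensors();
}
```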
To reproduce
Run inference on a model that is too large to be cached (or force the TensorRT EP to return the error "TensorRT EP failed to create engine from network."). Then run inference again on the same session --> crash.
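For reference, a hedged sketch of the repro written against the ORT C++ API (we drive it through the C API in our application); the model path, input shape, and tensor names are placeholders for a model whose TensorRT engine build fails.

```cpp
#include <onnxruntime_cxx_api.h>
#include <iostream>
#include <vector>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "trt_crash_repro");
  Ort::SessionOptions so;
  OrtTensorRTProviderOptions trt_options{};       // default TRT EP options
  so.AppendExecutionProvider_TensorRT(trt_options);
  Ort::Session session(env, "large_model.onnx", so);  // placeholder model path

  std::vector<int64_t> shape{1, 3, 224, 224};          // placeholder shape
  std::vector<float> data(1 * 3 * 224 * 224, 0.0f);
  auto mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input = Ort::Value::CreateTensor<float>(
      mem, data.data(), data.size(), shape.data(), shape.size());
  const char* in_names[] = {"input"};    // placeholder tensor names
  const char* out_names[] = {"output"};

  for (int attempt = 0; attempt < 2; ++attempt) {
    try {
      // First Run(): fails with "TensorRT EP failed to create engine from network."
      // Second Run() on the same session (ORT 1.18.0): segfault in getNbIOTensors().
      session.Run(Ort::RunOptions{nullptr}, in_names, &input, 1, out_names, 1);
    } catch (const Ort::Exception& e) {
      std::cerr << "Run() attempt " << attempt << " failed: " << e.what() << std::endl;
    }
  }
  return 0;
}
```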
Urgency
No response
Platform
Linux
OS Version
ROCKY 8.5 (gcc-11.2.1, c++17)
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.18.0
ONNX Runtime API
C
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
CUDA 11.8