laxnpander closed 4 months ago
Hey,
Cross-reference to my issue on onnxruntime: https://github.com/microsoft/onnxruntime/issues/19076
When I run inference with the model at hand (superpoint_lightglue_end2end), I get a CUDA memcpy exception. My CUDA version is 11.8 with cuDNN 8.9.
Verbose log: verbose_log.txt
According to the verbose log, it always seems to happen at the kernel with idx 2478. Has anyone experienced something similar? What could be the cause?
Stack trace:
<unknown> 0x00007fffd9970935
<unknown> 0x00007fffd9a5d86a
<unknown> 0x00007fffd9b914cb
<unknown> 0x00007fffd9b91d61
<unknown> 0x00007fffd9cb9130
<unknown> 0x00007fffd9931a33
<unknown> 0x00007fffd9931f41
<unknown> 0x00007fffd9932ea8
<unknown> 0x00007fffd9b000d1
<unknown> 0x00007fffdb644459
<unknown> 0x00007fffdb6176fd
cudaMemcpyAsync 0x00007fffdb6696a5
onnxruntime::GPUDataTransfer::CopyTensorAsync(onnxruntime::Tensor const&, onnxruntime::Tensor&, onnxruntime::Stream&) const 0x00007fff9fd1b0dd
onnxruntime::IDataTransfer::CopyTensors(std::vector<onnxruntime::IDataTransfer::SrcDstPair, std::allocator<onnxruntime::IDataTransfer::SrcDstPair> > const&) const 0x00007ffff6dbbe63
onnxruntime::ProviderHostImpl::IDataTransfer__CopyTensors(onnxruntime::IDataTransfer const*, std::vector<onnxruntime::IDataTransfer::SrcDstPair, std::allocator<onnxruntime::IDataTransfer::SrcDstPair> > const&) 0x00007ffff66406a8
onnxruntime::IDataTransfer::CopyTensors(std::vector<onnxruntime::IDataTransfer::SrcDstPair, std::allocator<onnxruntime::IDataTransfer::SrcDstPair> > const&) const 0x00007fff9ff35bc7
onnxruntime::DataTransferManager::CopyTensors(std::vector<onnxruntime::IDataTransfer::SrcDstPair, std::allocator<onnxruntime::IDataTransfer::SrcDstPair> > const&) const 0x00007ffff6dbf95d
onnxruntime::utils::ExecuteGraphImpl(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager const&, gsl::span<OrtValue const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator<OrtValue> >&, std::unordered_map<unsigned long, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)> > > > const&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollection*, bool, onnxruntime::Stream*) 0x00007ffff6e65802
onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator<OrtValue> >&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollectionHolder&, bool, onnxruntime::Stream*) 0x00007ffff6e66e8b
onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator<OrtValue> >&, ExecutionMode, OrtRunOptions const&, onnxruntime::DeviceStreamCollectionHolder&, onnxruntime::logging::Logger const&) 0x00007ffff6e671f3
onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, 18446744073709551615ul>, gsl::span<OrtValue const, 18446744073709551615ul>, gsl::span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, 18446744073709551615ul>, std::vector<OrtValue, std::allocator<OrtValue> >*, std::vector<OrtDevice, std::allocator<OrtDevice> > const*) [clone .localalias] 0x00007ffff668ac8a
onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<char const* const, 18446744073709551615ul>, gsl::span<OrtValue const* const, 18446744073709551615ul>, gsl::span<char const* const, 18446744073709551615ul>, gsl::span<OrtValue*, 18446744073709551615ul>) 0x00007ffff668bab2
OrtApis::Run(OrtSession*, OrtRunOptions const*, char const* const*, OrtValue const* const*, unsigned long, char const* const*, unsigned long, OrtValue**) 0x00007ffff6613fff
Ort::detail::SessionImpl::Run onnxruntime_cxx_inline.h:967
spear::ort::Inference::run Inference.h:314
main superpoint_lightglue_main.cpp:67
__libc_start_call_main 0x00007ffff5c29d90
__libc_start_main_impl 0x00007ffff5c29e40
_start 0x0000555555558d55
Released improved end-to-end models