microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Program received signal SIGSEGV, Segmentation fault when I construct Ort::Session #8707

Open liym27 opened 3 years ago

liym27 commented 3 years ago

Describe the bug

Run code:

// ...
// something else
// ...

Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "test");
Ort::SessionOptions session_options;
string model_path = model_dir + "/" + model_meta.name();
Ort::Session my_session(env, model_path.c_str(), session_options);

// ...
// something else
// ...

Segmentation fault C++ stack:

Program received signal SIGSEGV, Segmentation fault.
tcache_dalloc_large (slow_path=false, binind=0, ptr=0x7fffffffced8, tcache=0x7ffff7e93980, tsd=0x7ffff7e937c0) at include/jemalloc/internal/tcache_inlines.h:203
203     include/jemalloc/internal/tcache_inlines.h: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.tl2.3.x86_64 libaio-0.3.109-13.el7.x86_64 zlib-1.2.7-15.el7.x86_64
(gdb) bt
#0  tcache_dalloc_large (slow_path=false, binind=0, ptr=0x7fffffffced8, tcache=0x7ffff7e93980, tsd=0x7ffff7e937c0) at include/jemalloc/internal/tcache_inlines.h:203
#1  arena_dalloc (slow_path=false, alloc_ctx=<synthetic pointer>, tcache=0x7ffff7e93980, ptr=0x7fffffffced8, tsdn=0x7ffff7e937c0)
    at include/jemalloc/internal/arena_inlines_b.h:232
#2  idalloctm (slow_path=false, is_internal=false, alloc_ctx=<synthetic pointer>, tcache=0x7ffff7e93980, ptr=0x7fffffffced8, tsdn=0x7ffff7e937c0)
    at include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
#3  ifree (slow_path=false, tcache=0x7ffff7e93980, ptr=0x7fffffffced8, tsd=0x7ffff7e937c0) at src/jemalloc.c:2223
#4  free (ptr=0x7fffffffced8) at src/jemalloc.c:2394
#5  0x000000000620e74e in __gnu_cxx::new_allocator<std::__detail::_Hash_node_base*>::deallocate(std::__detail::_Hash_node_base**, unsigned long) ()
#6  0x000000000f76dbe2 in std::_Hashtable<std::string, std::pair<std::string const, int>, std::allocator<std::pair<std::string const, int> >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_deallocate_buckets(std::__detail::_Hash_node_base**, unsigned long) ()
#7  0x000000000f8d9022 in std::_Hashtable<std::string, std::pair<std::string const, int>, std::allocator<std::pair<std::string const, int> >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_deallocate_buckets() ()
#8  0x000000000f912aa3 in std::_Hashtable<std::string, std::pair<std::string const, int>, std::allocator<std::pair<std::string const, int> >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_move_assign(std::_Hashtable<std::string, std::pair<std::string const, int>, std::allocator<std::pair<std::string const, int> >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >&&, std::integral_constant<bool, true>) ()
#9  0x000000000f907b3a in std::_Hashtable<std::string, std::pair<std::string const, int>, std::allocator<std::pair<std::string const, int> >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::operator=(std::_Hashtable<std::string, std::pair<std::string const, int>, std::allocator<std::pair<std::string const, int> >, std::__detail::_Select1st, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >&&) ()
#10 0x000000000f9047fb in std::unordered_map<std::string, int, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, int> > >::operator=(std::unordered_map<std::string, int, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, int> > >&&) ()
#11 0x000000000f904834 in onnx::checker::CheckerContext::set_opset_imports(std::unordered_map<std::string, int, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, int> > >) ()
#12 0x000000000f8f3ae1 in onnxruntime::Graph::VerifyNodeAndOpMatch(onnxruntime::Graph::ResolveOptions const&) ()
#13 0x000000000f8f4ef8 in onnxruntime::Graph::PerformTypeAndShapeInferencing(onnxruntime::Graph::ResolveOptions const&) ()
#14 0x000000000f8f58eb in onnxruntime::Graph::Resolve(onnxruntime::Graph::ResolveOptions const&) ()
#15 0x000000000f94eb1d in onnxruntime::Model::Load(int, std::string const&, std::shared_ptr<onnxruntime::Model>&, std::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*, onnxruntime::logging::Logger const&) ()
#16 0x000000000f94f384 in onnxruntime::common::Status onnxruntime::LoadModel<std::string>(std::string const&, std::shared_ptr<onnxruntime::Model>&, std::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*, onnxruntime::logging::Logger const&)::{lambda(int)#1}::operator()(int) const ()
#17 0x000000000f94fc32 in onnxruntime::common::Status onnxruntime::LoadModelHelper<std::string, onnxruntime::common::Status onnxruntime::LoadModel<std::string>(std::string const&, std::shared_ptr<onnxruntime::Model>&, std::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*, onnxruntime::logging::Logger const&)::{lambda(int)#1}>(std::string const&, onnxruntime::common::Status onnxruntime::LoadModel<std::string>(std::string const&, std::shared_ptr<onnxruntime::Model>&, std::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*, onnxruntime::logging::Logger const&)::{lambda(int)#1}) ()
#18 0x000000000f94f416 in onnxruntime::common::Status onnxruntime::LoadModel<std::string>(std::string const&, std::shared_ptr<onnxruntime::Model>&, std::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*, onnxruntime::logging::Logger const&) ()
#19 0x000000000f94e3ed in onnxruntime::Model::Load(std::string const&, std::shared_ptr<onnxruntime::Model>&, std::list<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection>, std::allocator<std::shared_ptr<onnxruntime::IOnnxRuntimeOpSchemaCollection> > > const*, onnxruntime::logging::Logger const&) ()
#20 0x000000000f9bed6e in onnxruntime::common::Status onnxruntime::InferenceSession::Load<char>(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::shared_ptr<onnxruntime::Model>&)#1}::operator()(std::shared_ptr<onnxruntime::Model>&) const ()
#21 0x000000000f9c67a9 in std::_Function_handler<onnxruntime::common::Status (std::shared_ptr<onnxruntime::Model>&), onnxruntime::common::Status onnxruntime::InferenceSession::Load<char>(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::shared_ptr<onnxruntime::Model>&)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<onnxruntime::Model>&) ()
#22 0x000000000f9bec11 in std::function<onnxruntime::common::Status (std::shared_ptr<onnxruntime::Model>&)>::operator()(std::shared_ptr<onnxruntime::Model>&) const ()
#23 0x000000000f9ae3f8 in onnxruntime::InferenceSession::Load(std::function<onnxruntime::common::Status (std::shared_ptr<onnxruntime::Model>&)>, std::string const&) ()
#24 0x000000000f9bee60 in onnxruntime::common::Status onnxruntime::InferenceSession::Load<char>(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#25 0x000000000f9aeb35 in onnxruntime::InferenceSession::Load(std::string const&) ()
#26 0x000000000f9eb604 in (anonymous namespace)::CreateSessionAndLoadModel(OrtSessionOptions const*, OrtEnv const*, char const*, void const*, unsigned long, std::unique_ptr<onnxruntime::InferenceSession, std::default_delete<onnxruntime::InferenceSession> >&) ()
#27 0x000000000f9ebbba in OrtApis::CreateSession(OrtEnv const*, char const*, OrtSessionOptions const*, OrtSession**) ()
#28 0x0000000006200e45 in Ort::Session::Session(Ort::Env&, char const*, Ort::SessionOptions const&) ()

System information

OS: Tencent tlinux 2.2 (Final) (x86_64)
GCC version: (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Clang version: Could not collect
CMake version: version 2.8.12.2
Libc version: glibc-2.17

Python version: 3.6.13 |Anaconda, Inc.| [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-3.10.107-1-tlinux2-0050-x86_64-with-centos-7.2-Final
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] torch==1.9.0
[pip3] torchtext==0.10.0
[conda] Could not collect

YUNQIUGUO commented 3 years ago

Can you maybe set the logging level to be verbose and see more info? Also if possible, could you please share the model you are using?
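
For reference, a minimal sketch of what raising the log level could look like with the C++ API used in the report. The helper name try_create_session_verbose and the __has_include guard are illustrative assumptions, not part of the original code; the guard only lets the sketch compile on a machine without the onnxruntime headers. Verbose logging does not prevent a segmentation fault, but it may show how far model loading got before the crash.

```cpp
#include <iostream>
#include <string>

// Assumption: only pull in onnxruntime when its C++ header is on the include path.
#if defined(__has_include)
#  if __has_include(<onnxruntime_cxx_api.h>)
#    include <onnxruntime_cxx_api.h>
#    define HAVE_ORT 1
#  endif
#endif

// Hypothetical helper: tries to create a session with verbose logging enabled.
// Returns true if the session was constructed, false if onnxruntime reported
// an error or the headers were not available at compile time.
bool try_create_session_verbose(const std::string& model_path) {
#ifdef HAVE_ORT
  try {
    // Same calls as in the report, with WARNING replaced by VERBOSE.
    Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "test");
    Ort::SessionOptions session_options;
    // On Linux the path is a char string; Windows builds expect wchar_t.
    Ort::Session session(env, model_path.c_str(), session_options);
    return true;
  } catch (const Ort::Exception& e) {
    // Errors that onnxruntime detects itself surface here, not as a SIGSEGV.
    std::cerr << "onnxruntime error: " << e.what() << "\n";
    return false;
  }
#else
  std::cerr << "onnxruntime headers not found; cannot load " << model_path << "\n";
  return false;
#endif
}
```

Calling try_create_session_verbose(model_path) in place of the direct Ort::Session construction should print the verbose log to the console up to the point of failure.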

yuslepukhin commented 3 years ago

What is your onnxruntime version?

snnn commented 3 years ago

Why does jemalloc show up here?

snnn commented 3 years ago

Please tell us what you changed. The call stack is unusual.

liym27 commented 3 years ago

What is your onnxruntime version?

1.5.1

liym27 commented 3 years ago

Why does jemalloc show up here?

Can you maybe set the logging level to be verbose and see more info? Also if possible, could you please share the model you are using?

ok

snnn commented 3 years ago

What is your onnxruntime version?

1.5.1

Does the latest onnxruntime release have the same issue?

liym27 commented 3 years ago

What is your onnxruntime version?

1.5.1

Does the latest onnxruntime release have the same issue?

Thanks, but I am not using the latest onnxruntime release. I now also get a new segmentation fault in Ort::Session::Run().

Li-chunming commented 2 years ago

Can you maybe set the logging level to be verbose and see more info? Also if possible, could you please share the model you are using?

What is your onnxruntime version?

1.5.1

Does the latest onnxruntime release have the same issue?

#9335: I have hit a similar error with the newer version 1.9.0.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.