Closed Alexma3312 closed 6 years ago
I'd assume this is caused by libnabo. If you don't need maximal speed, you could try compiling libnabo without gomp and running it again. There is a CMake flag that makes this easy: https://github.com/ethz-asl/libnabo/blob/master/CMakeLists.txt#L89
Thank you for the response. However, I changed the variable in the flag to false and turned off USE_OPEN_MP, and the process still died, now with a new error:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector
Can you give me some help on this?
Hm, this sounds like too many threads are being created in your executable, and libnabo is not actually causing the issue. But for debugging I'd still keep OPEN_MP deactivated for libnabo.
Can you get a stack trace for that exception? Then we would have a new candidate for what is failing to reuse or join threads. I'm just starting to remember that there is a known bug in this library where threads are never joined. Apparently the fix is not in this branch. @kruesip, do you remember this (boost threads not cleaning up after themselves)? And where did you put the solution?
I vaguely remember having had this problem, but unfortunately I forgot how and where we solved it. All I know is that I was using the branch indigo_devel (not reintegrate/master_into_indigo_devel). But I haven't used the library for quite a while, so I don't know if that is still working.
Thanks @kruesip ! But I could not find any existing solution.
I believe I found the bug in these two lines: https://github.com/ethz-asl/ethzasl_icp_mapping/blob/877a67471ab0b0d91b6a40e923a11e25c5306259/ethzasl_icp_mapper/src/mapper.cpp#L480 https://github.com/ethz-asl/ethzasl_icp_mapping/blob/877a67471ab0b0d91b6a40e923a11e25c5306259/ethzasl_icp_mapper/src/dynamic_mapper.cpp#L538
It is quite possible that these lines create more and more detached threads till you hit the maximum.
Right now I don't have the resources to fix this. @Alexma3312, if you know how to fix this and have the time to do it, I would be happy to review a PR. The fix is to join each thread before dropping its handle by assigning a new thread to it; alternatively, a single worker thread could be reused over and over.
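To illustrate the "join before assigning a new thread" idea, here is a minimal sketch. It uses `std::thread` rather than the `boost::thread` objects in the mapper, and `SingleBackgroundTask` and `runRounds` are hypothetical names invented for this example, not code from ethzasl_icp_mapping. The point is only that the old thread is joined before its handle is overwritten, so at most one background thread exists at a time.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper (not part of ethzasl_icp_mapping): owns at most one
// background thread and joins the previous one before starting the next.
class SingleBackgroundTask {
public:
  SingleBackgroundTask() = default;
  SingleBackgroundTask(const SingleBackgroundTask&) = delete;
  SingleBackgroundTask& operator=(const SingleBackgroundTask&) = delete;

  // Never leave a running thread behind when the owner goes away.
  ~SingleBackgroundTask() { joinIfRunning(); }

  // Blocks until the previous task has finished, then starts the new one.
  void start(std::function<void()> task) {
    joinIfRunning();
    assert(!thread_.joinable());
    thread_ = std::thread(std::move(task));
  }

  // Waits for the current task, if any, and releases its resources.
  void joinIfRunning() {
    if (thread_.joinable()) {
      thread_.join();
    }
  }

private:
  std::thread thread_;
};

// Starts `rounds` tasks one after another and returns how many completed.
int runRounds(int rounds) {
  std::atomic<int> completed{0};
  SingleBackgroundTask runner;
  for (int i = 0; i < rounds; ++i) {
    runner.start([&completed] { ++completed; });
  }
  runner.joinIfRunning();
  return completed.load();
}
```

With `std::thread`, move-assigning to a handle that is still joinable calls `std::terminate`, so the join is mandatory there. The trade-off of this sketch is that `start` blocks while the previous task is still running; a long-lived worker thread fed from a queue would avoid that wait.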
Thank you, I hope to fix this problem. I understand your suggested solution; however, would it be easier to set the maximum number of threads to infinite?
On the other hand, would it help if I provided the stack trace? How can I provide one?
I don't think the stack trace will provide new insights. Infinite threads are not possible: a thread consumes kernel resources (even if its entry function has returned). Of course, you might be able to work around your issue by raising the limit enough (https://stackoverflow.com/a/344292).
Linking with issue #64. I think GOMP is not responsible here; it just happens to be the one that fails because the number of non-joined threads created by the main program grows without bound. I'd mark this as resolved once #64 is implemented in mapper and dynamic_mapper.
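For anyone who wants to see one of the limits involved, the sketch below reads the per-user process/thread limit through `getrlimit`. It assumes Linux, where `RLIMIT_NPROC` counts threads as well as processes; `queryProcessLimit` is a hypothetical helper name, and other limits (memory for stacks, the system-wide thread maximum) may be hit first.

```cpp
#include <sys/resource.h>

// Hypothetical helper: fills `out` with the soft and hard RLIMIT_NPROC
// values for the calling process. Returns true on success.
// Assumption: a Linux-like system where RLIMIT_NPROC is available.
bool queryProcessLimit(rlimit& out) {
  return getrlimit(RLIMIT_NPROC, &out) == 0;
}

// Returns true if the soft limit is unlimited (RLIM_INFINITY).
bool softLimitIsUnlimited(const rlimit& limits) {
  return limits.rlim_cur == RLIM_INFINITY;
}
```

The soft limit (`rlim_cur`) can be raised up to the hard limit (`rlim_max`) without extra privileges, but that only delays the failure if threads keep accumulating.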
This should be solved with PR #65. Please reopen if not.
Hi,
The process died by itself (exit code 1) about 30 minutes after I launched Kingfisher.launch. The cause was a thread creation failure. When I relaunched the launch file, it worked again, but after about 30 minutes the process died for the same reason.
I am on the reintegrate/master_into_indigo_devel branch, and the launch file path is ethzasl_icp_mapping/ethzasl_icp_mapper/launch/kingfisher/kingfisher.launch. How should I solve this problem? Thanks!