Open bfaccini opened 1 year ago
Let me know if you agree with my analysis and whether you want me to push a PR (I will need some guidance on the usual procedure to follow).
@bfaccini the proposed solution is good enough, though a better way would probably be to use backtrace_fd. Please see https://github.com/openucx/ucx/wiki/Guidance-for-contributors
> though a better way would probably be to use backtrace_fd
backtrace_fd()? Do you mean backtrace_symbols_fd()? If so, I am not sure it would help, because I believe the suspected ENOMEM occurred in ucs_debug_backtrace_create().
OK, so the proposed solution seems good to me.
Describe the bug
The application crashed with a SEGV (apparently due to a failed allocation/ENOMEM that was not handled), and then the UCX signal handler/stack unwinder also crashed with a SEGV (again due to an unhandled ENOMEM/failed allocation), with the following stack:
It looks like the error handling applied to the return value of ucs_debug_backtrace_create() in ucs_log_print_backtrace() must also be applied in ucs_debug_print_backtrace(), as in the following changes:
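For illustration only (this is not the patch proposed in this issue, and every name below is a hypothetical stand-in rather than UCX code), the pattern being described is roughly: check the status returned by the backtrace constructor and return early on failure, instead of walking a handle that was never allocated. A minimal self-contained sketch, assuming the constructor reports allocation failure through its return value:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are NOT the UCX types. */
typedef enum {
    SKETCH_OK            = 0,
    SKETCH_ERR_NO_MEMORY = -1
} sketch_status_t;

typedef struct {
    int depth; /* number of captured frames */
} sketch_backtrace_t;

/* Simulated allocating constructor; fail_alloc forces the ENOMEM path. */
sketch_status_t sketch_backtrace_create(sketch_backtrace_t **bt_p,
                                        int fail_alloc)
{
    sketch_backtrace_t *bt;

    assert(bt_p != NULL);

    bt = fail_alloc ? NULL : malloc(sizeof(*bt));
    if (bt == NULL) {
        *bt_p = NULL;
        return SKETCH_ERR_NO_MEMORY;
    }

    bt->depth = 3; /* pretend three frames were captured */
    *bt_p     = bt;
    return SKETCH_OK;
}

/* Returns the number of frames printed, or -1 if creation failed. */
int sketch_print_backtrace(FILE *stream, int fail_alloc)
{
    sketch_backtrace_t *bt;
    sketch_status_t status;
    int i;

    status = sketch_backtrace_create(&bt, fail_alloc);
    if (status != SKETCH_OK) {
        /* Bail out instead of dereferencing an invalid handle. */
        return -1;
    }

    for (i = 0; i < bt->depth; ++i) {
        fprintf(stream, "frame %d\n", i);
    }

    free(bt);
    return i;
}
```

Calling sketch_print_backtrace(stderr, 1) takes the early-return path and prints nothing, whereas without the status check the loop would read bt->depth through a NULL pointer, which is the kind of second crash inside the handler that this report describes.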
Steps to Reproduce
ucx_info -v