Closed Karsten1987 closed 3 years ago
@elfenpiff Have you had a look at this segfault already?
@Karsten1987 Did you check again after the iceoryx release in August?
I just did and I still run into a segmentation fault:
➭ lldb ~/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker
(lldb) target create "/Users/karsten/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker"
Current executable set to '/Users/karsten/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker' (x86_64).
(lldb) env RMW_IMPLEMENTATION=rmw_iceoryx_cpp
(lldb) r
Process 88808 launched: '/Users/karsten/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker' (x86_64)
1970-01-05 16:34:33.419 [Warning]: MQ still there, doing an unlink of /talker_88808
socket: "/tmp//roudi", timedSend with a timeout != 0 is not supported on MacOS. timedSend will behave like send instead.
1970-01-05 16:34:33.421 [ Debug ]: Application registered management segment 0x110000000 with size 82677696 to id 1
1970-01-05 16:34:33.424 [ Info ]: Application registered payload segment 0x114ed9000 with size 149655680 to id 2
Process 88808 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7ff31bd00610)
frame #0: 0x00000001028871ea librmw_iceoryx_name_conversion.dylib`void std::__1::__cxx_atomic_store<int>(std::__1::__cxx_atomic_base_impl<int>*, int, std::__1::memory_order) + 74
librmw_iceoryx_name_conversion.dylib`std::__1::__cxx_atomic_store<int>:
-> 0x1028871ea <+74>: movl %eax, (%rcx)
0x1028871ec <+76>: jmp 0x102887208 ; <+104>
0x1028871f1 <+81>: movl -0x14(%rbp), %eax
0x1028871f4 <+84>: movq -0x20(%rbp), %rcx
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7ff31bd00610)
* frame #0: 0x00000001028871ea librmw_iceoryx_name_conversion.dylib`void std::__1::__cxx_atomic_store<int>(std::__1::__cxx_atomic_base_impl<int>*, int, std::__1::memory_order) + 74
frame #1: 0x0000000102886b64 librmw_iceoryx_name_conversion.dylib`std::__1::__atomic_base<int, false>::store(int, std::__1::memory_order) + 36
frame #2: 0x0000000102886ae0 librmw_iceoryx_name_conversion.dylib`iox_sem_init(iox_sem_t*, int, unsigned int) + 64
frame #3: 0x0000000102859852 librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>::SmartC(char const*, int, char const*, int (&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) + 162
frame #4: 0x00000001028596cd librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>::SmartC(char const*, int, char const*, int (&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) + 157
frame #5: 0x0000000102855143 librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int> iox::cxx::makeSmartCImpl<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>(char const*, int, char const*, int const(&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) + 403
frame #6: 0x0000000102854103 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::init(iox_sem_t*, int, unsigned int) + 243
frame #7: 0x0000000102854399 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::Semaphore(iox::posix::CreateUnnamedSharedMemorySemaphore_t, iox_sem_t*, unsigned int) + 201
frame #8: 0x0000000102854413 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::Semaphore(iox::posix::CreateUnnamedSharedMemorySemaphore_t, iox_sem_t*, unsigned int) + 35
frame #9: 0x00000001028266a6 librmw_iceoryx_name_conversion.dylib`iox::cxx::expected<iox::posix::Semaphore, iox::posix::SemaphoreError> DesignPattern::Creation<iox::posix::Semaphore, iox::posix::SemaphoreError>::create<iox::posix::CreateUnnamedSharedMemorySemaphore_t const&, iox_sem_t*, int>(iox::posix::CreateUnnamedSharedMemorySemaphore_t const&, iox_sem_t*&&, int&&) + 150
frame #10: 0x0000000102826515 librmw_iceoryx_name_conversion.dylib`iox::popo::ReceiverPort::GetShmSemaphore() + 149
frame #11: 0x0000000101f1f1be librmw_iceoryx_cpp.dylib`iox::popo::Subscriber_t<iox::popo::ReceiverPort>::getSemaphore() const + 30
frame #12: 0x0000000101f1eda6 librmw_iceoryx_cpp.dylib`rmw_create_wait_set + 902
frame #13: 0x000000010139d2d5 librmw_implementation.dylib`::rmw_create_wait_set(v2=0x0000000102a00570, v1=2) at functions.cpp:483:1
frame #14: 0x000000010137125a librcl.dylib`rcl_wait_set_init(wait_set=0x00007ffeefbfd9d8, number_of_subscriptions=0, number_of_guard_conditions=2, number_of_timers=0, number_of_clients=0, number_of_services=0, number_of_events=0, context=0x0000000102a00190, allocator=rcl_allocator_t @ 0x00007ffeefbfa950) at wait.c:163:34
frame #15: 0x0000000100334a27 librclcpp.dylib`rclcpp::Executor::Executor(this=0x00007ffeefbfd9b8, options=0x00007ffeefbfc610) at executor.cpp:66:9
frame #16: 0x000000010035ece7 librclcpp.dylib`rclcpp::executors::SingleThreadedExecutor::SingleThreadedExecutor(this=0x00007ffeefbfd9b8, options=0x00007ffeefbfc610) at single_threaded_executor.cpp:22:3
frame #17: 0x000000010035ed2d librclcpp.dylib`rclcpp::executors::SingleThreadedExecutor::SingleThreadedExecutor(this=0x00007ffeefbfd9b8, options=0x00007ffeefbfc610) at single_threaded_executor.cpp:22:29
frame #18: 0x00000001000106b0 talker`main(argc=1, argv=0x00007ffeefbfdba8) at node_main_talker.cpp:30:45
frame #19: 0x00007fff663303d5 libdyld.dylib`start + 1
frame #20: 0x00007fff663303d5 libdyld.dylib`start + 1
I can't tell if that's due to an outdated RMW though.
As a side note, my console output gets flooded with messages like this:
[ 2 ] No such file or directory
/Users/karsten/workspace/ros2/iceoryx_ws/osx/src/iceoryx/iceoryx_utils/source/posix_wrapper/unix_domain_socket.cpp:379 { cxx::expected<int32_t, IpcChannelError> iox::posix::UnixDomainSocket::createSocket(const iox::posix::IpcChannelMode) } :::
The reason was that iox-roudi was not started. A warning was printed at the beginning, but with the constant console output it was almost impossible to notice.
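One way to make a missing RouDi obvious is to check for its socket once before starting the application, instead of relying on the repeated error output. The following is only a sketch under assumptions: the helper name `isUnixDomainSocket` is hypothetical, and the path `/tmp/roudi` is taken from the log line above rather than from any documented iceoryx API. Note that a socket file can be left over from a crashed RouDi, so a positive result does not prove that RouDi is running.

```cpp
#include <sys/stat.h>

#include <string>

// Hypothetical helper (not part of iceoryx): returns true if `path` exists
// and refers to a Unix domain socket, false in every other case.
bool isUnixDomainSocket(const std::string& path)
{
    struct stat info {};
    if (stat(path.c_str(), &info) != 0)
    {
        // The path does not exist or cannot be inspected.
        return false;
    }
    return S_ISSOCK(info.st_mode);
}
```

A launch wrapper could call `isUnixDomainSocket("/tmp/roudi")` once and print a single clear hint to start iox-roudi when it returns false.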
I'm setting up my Macbook this week. Please be patient, I'll have a look!
@Karsten1987 So, to reproduce it I follow the steps on rmw_iceoryx and then start a ROS2 app? If you can give me some details I can start working on this issue!
With the upcoming iceoryx 0.9 release we break the API and have to adapt rmw_iceoryx. That means this error can only be analyzed with iceoryx 0.17.0. My proposal would be to not officially support Mac with rmw_iceoryx based on 0.17.0. We have to ensure that iceoryx runs fine on Mac with the 1.0 release; then we can implement the next generation of the rmw and check that it also works on Mac.
Fine for all? @Karsten1987 @marthtz
@marthtz Please check the upcoming 0.9 on your Mac.
It looks like it never worked. The issue is that the unnamed shared semaphore is not working: shared memory semaphores created in RouDi can't be used in apps in the current implementation. That's also visible in the backtrace above. The problem is that there don't seem to be any tests checking shared semaphore usage across multiple processes.
@dkroenke @budrus
We definitely need real inter-process tests, otherwise we will run into such an issue again; the roudi environment will never catch such issues. At the moment the waitset is NOT working on macOS, and this needs to be fixed by fixing the macOS semaphore platform implementation. This should not be too time-consuming, but at the moment I am unable to implement it.
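To illustrate what such an inter-process check could look like, here is a minimal sketch using plain POSIX calls rather than the iceoryx wrappers. The names `RoundTrip` and `crossProcessSemaphoreRoundTrip` are hypothetical and not part of iceoryx. The helper places an unnamed semaphore in an anonymous shared mapping, lets a forked child post it, and then verifies in the parent that the post is visible. As far as I know, macOS does not implement `sem_init`, so that case is reported separately instead of being treated as a test failure.

```cpp
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type for this sketch.
enum class RoundTrip { Ok, InitUnsupported, SetupFailed, PostNotVisible };

// Hypothetical helper: checks that a post made in a child process on an
// unnamed, process-shared semaphore is observed by the parent process.
RoundTrip crossProcessSemaphoreRoundTrip()
{
    // The semaphore must live in memory that both processes can see.
    void* mem = mmap(nullptr, sizeof(sem_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
    {
        return RoundTrip::SetupFailed;
    }
    auto* sem = static_cast<sem_t*>(mem);

    // pshared = 1 requests a semaphore usable across processes, initial value 0.
    if (sem_init(sem, 1, 0) != 0)
    {
        munmap(mem, sizeof(sem_t));
        return RoundTrip::InitUnsupported;
    }

    const pid_t child = fork();
    if (child < 0)
    {
        sem_destroy(sem);
        munmap(mem, sizeof(sem_t));
        return RoundTrip::SetupFailed;
    }
    if (child == 0)
    {
        // Child: post once and exit without running parent-side cleanup.
        _exit(sem_post(sem) == 0 ? 0 : 1);
    }

    // Parent: wait for the child to finish, then take the token without
    // blocking, so a broken implementation cannot hang the test.
    int status = 0;
    waitpid(child, &status, 0);
    const bool posted = (sem_trywait(sem) == 0);

    sem_destroy(sem);
    munmap(mem, sizeof(sem_t));
    return posted ? RoundTrip::Ok : RoundTrip::PostNotVisible;
}
```

A test on a platform with working unnamed semaphores would expect `RoundTrip::Ok`. With older glibc versions the program may need to be linked with `-pthread`.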
@Karsten1987 @marthtz I think this issue can be closed and it is fixed in master. Could you please verify?
Do you have instructions for this? The current rmw_iceoryx master doesn't build with iceoryx. I don't really know how to run the ROS2 demo without the rmw.
@Karsten1987 I created two issues, https://github.com/ros2/rmw_iceoryx/issues/41 and https://github.com/ros2/rmw_iceoryx/issues/40 . Additionally, I created a support issue on our side to track those two: https://github.com/eclipse-iceoryx/iceoryx/issues/511. Therefore, I would close this issue for now and continue the rmw_iceoryx work there.
The reason is that I am pretty sure an issue with the shared memory semaphore caused the bug, and it should be fixed now. But we have no way to be sure while rmw_iceoryx does not even build with iceoryx anymore.
Required information
Operating system: OSX
Compiler version: Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Observed result or behavior: segmentation fault
bt:
Expected result or behaviour:
RMW_IMPLEMENTATION=rmw_iceoryx_cpp ros2 run demo_nodes_cpp talker
to run successfully.