eclipse-iceoryx / iceoryx

Eclipse iceoryx™ - true zero-copy inter-process-communication
https://iceoryx.io
Apache License 2.0

Unable to run ROS2 demo application on OSX #204

Closed: Karsten1987 closed this issue 3 years ago

Karsten1987 commented 4 years ago

Required information

Operating system: OSX

Compiler version: Apple LLVM version 10.0.1 (clang-1001.0.46.4)

Observed result or behavior: segmentation fault

1970-01-13 04:30:05.895 [ Debug ]: Application registered management segment 0x103000000 with size 71578784 to id 1
1970-01-13 04:30:05.897 [ Info  ]: Application registered payload segment 0x110000000 with size 149655680 to id 2
Process 74700 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7ffc1be007c0)
    frame #0: 0x000000010209af90 librmw_iceoryx_name_conversion.dylib`std::__1::__atomic_base<int, false>::store(this=0x00007ffc1be007c0, __d=0, __m=memory_order_relaxed) at atomic:921:10
   918      _LIBCPP_INLINE_VISIBILITY
   919      void store(_Tp __d, memory_order __m = memory_order_seq_cst) _NOEXCEPT
   920        _LIBCPP_CHECK_STORE_MEMORY_ORDER(__m)
-> 921          {__c11_atomic_store(&__a_, __d, __m);}
   922      _LIBCPP_INLINE_VISIBILITY
   923      _Tp load(memory_order __m = memory_order_seq_cst) const volatile _NOEXCEPT
   924        _LIBCPP_CHECK_LOAD_MEMORY_ORDER(__m)
Target 0: (talker) stopped.

bt:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7ffc1be007c0)
  * frame #0: 0x000000010209af90 librmw_iceoryx_name_conversion.dylib`std::__1::__atomic_base<int, false>::store(this=0x00007ffc1be007c0, __d=0, __m=memory_order_relaxed) at atomic:921:10
    frame #1: 0x000000010209aef0 librmw_iceoryx_name_conversion.dylib`iox_sem_init(sem=0x0000000106365710, (null)=1, value=0) at semaphore.cpp:222:19
    frame #2: 0x00000001020683b2 librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>::SmartC(this=0x00007ffeefbece50, file="/Users/karsten/workspace/ros2/iceoryx_ws/osx/src/iceoryx/iceoryx_utils/source/posix_wrapper/semaphore.cpp", line=243, func="bool iox::posix::Semaphore::init(iox_sem_t *, const int, const unsigned int)", f_function=0x000000010209aeb0, f_mode=0x00007ffeefbece34, f_returnValues=0x00007ffeefbece20, f_ignoredValues=0x00000001020a8f30, f_args=0x0000000106365710, f_args=1, f_args=0)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) at smart_c.inl:99:21
    frame #3: 0x000000010206823d librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>::SmartC(this=0x00007ffeefbece50, file="/Users/karsten/workspace/ros2/iceoryx_ws/osx/src/iceoryx/iceoryx_utils/source/posix_wrapper/semaphore.cpp", line=243, func="bool iox::posix::Semaphore::init(iox_sem_t *, const int, const unsigned int)", f_function=0x000000010209aeb0, f_mode=0x00007ffeefbece34, f_returnValues=0x00007ffeefbece20, f_ignoredValues=0x00000001020a8f30, f_args=0x0000000106365710, f_args=1, f_args=0)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) at smart_c.inl:101:1
    frame #4: 0x000000010206414c librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int> iox::cxx::makeSmartCImpl<int (file="/Users/karsten/workspace/ros2/iceoryx_ws/osx/src/iceoryx/iceoryx_utils/source/posix_wrapper/semaphore.cpp", line=243, func="bool iox::posix::Semaphore::init(iox_sem_t *, const int, const unsigned int)", f_function=0x000000010209aeb0, f_mode=0x00007ffeefbece34, f_returnValues=0x00007ffeefbece20, f_ignoredValues=0x00000001020a8f30, f_args=0x0000000106365710, f_args=1, f_args=0), int, iox_sem_t*, int, unsigned int>(char const*, int, char const*, int  const(&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) at smart_c.inl:32:19
    frame #5: 0x00000001020632ed librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::init(this=0x00007ffeefbecfe0, handle=0x0000000106365710, pshared=1, value=0) at semaphore.cpp:243:13
    frame #6: 0x0000000102063430 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::Semaphore(this=0x00007ffeefbecfe0, handle=0x0000000106365710, value=0) at semaphore.cpp:200:9
    frame #7: 0x0000000102063493 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::Semaphore(this=0x00007ffeefbecfe0, handle=0x0000000106365710, value=0) at semaphore.cpp:199:1
    frame #8: 0x000000010202a662 librmw_iceoryx_name_conversion.dylib`iox::cxx::expected<iox::posix::Semaphore, iox::posix::SemaphoreError> DesignPattern::Creation<iox::posix::Semaphore, iox::posix::SemaphoreError>::create<iox_sem_t*, int>(args=0x00007ffeefbed0d8, args=0x00007ffeefbed0d4) at creation.hpp:31:23
    frame #9: 0x000000010202a4fe librmw_iceoryx_name_conversion.dylib`iox::popo::ReceiverPort::GetShmSemaphore(this=0x0000000100b00fd8) at receiver_port.cpp:290:40
    frame #10: 0x0000000100d2249c librmw_iceoryx_cpp.dylib`iox::popo::Subscriber_t<iox::popo::ReceiverPort>::getSemaphore(this=0x0000000100b00e60) const at subscriber.inl:222:33
    frame #11: 0x0000000100d22046 librmw_iceoryx_cpp.dylib`::rmw_create_wait_set(context=0x0000000100c01420, max_conditions=2) at rmw_wait_set.cpp:66:33
    frame #12: 0x000000010033200f librcl.dylib`rcl_wait_set_init + 511
    frame #13: 0x0000000100146d2b librclcpp.dylib`rclcpp::Executor::Executor(rclcpp::ExecutorOptions const&) + 779
    frame #14: 0x000000010014e66e librclcpp.dylib`rclcpp::executors::SingleThreadedExecutor::SingleThreadedExecutor(rclcpp::ExecutorOptions const&) + 14
    frame #15: 0x000000010000eb52 talker`main + 322
    frame #16: 0x00007fff5d77f3d5 libdyld.dylib`start + 1
    frame #17: 0x00007fff5d77f3d5 libdyld.dylib`start + 1

Expected result or behaviour: RMW_IMPLEMENTATION=rmw_iceoryx_cpp ros2 run demo_nodes_cpp talker to run successfully.

mossmaurice commented 3 years ago

@elfenpiff Have you had a look at this segfault already?

budrus commented 3 years ago

@Karsten1987 Did you check again after the iceoryx release in August?

Karsten1987 commented 3 years ago

I just did and I still run into a segmentation fault:

 ➭ lldb ~/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker
(lldb) target create "/Users/karsten/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker"
Current executable set to '/Users/karsten/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker' (x86_64).
(lldb) env RMW_IMPLEMENTATION=rmw_iceoryx_cpp
(lldb) r
Process 88808 launched: '/Users/karsten/workspace/ros2/ros2_master/install/lib/demo_nodes_cpp/talker' (x86_64)
1970-01-05 16:34:33.419 [Warning]: MQ still there, doing an unlink of /talker_88808
socket: "/tmp//roudi", timedSend with a timeout != 0 is not supported on MacOS. timedSend will behave like send instead.
1970-01-05 16:34:33.421 [ Debug ]: Application registered management segment 0x110000000 with size 82677696 to id 1
1970-01-05 16:34:33.424 [ Info  ]: Application registered payload segment 0x114ed9000 with size 149655680 to id 2
Process 88808 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7ff31bd00610)
    frame #0: 0x00000001028871ea librmw_iceoryx_name_conversion.dylib`void std::__1::__cxx_atomic_store<int>(std::__1::__cxx_atomic_base_impl<int>*, int, std::__1::memory_order) + 74
librmw_iceoryx_name_conversion.dylib`std::__1::__cxx_atomic_store<int>:
->  0x1028871ea <+74>: movl   %eax, (%rcx)
    0x1028871ec <+76>: jmp    0x102887208               ; <+104>
    0x1028871f1 <+81>: movl   -0x14(%rbp), %eax
    0x1028871f4 <+84>: movq   -0x20(%rbp), %rcx
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x7ff31bd00610)
  * frame #0: 0x00000001028871ea librmw_iceoryx_name_conversion.dylib`void std::__1::__cxx_atomic_store<int>(std::__1::__cxx_atomic_base_impl<int>*, int, std::__1::memory_order) + 74
    frame #1: 0x0000000102886b64 librmw_iceoryx_name_conversion.dylib`std::__1::__atomic_base<int, false>::store(int, std::__1::memory_order) + 36
    frame #2: 0x0000000102886ae0 librmw_iceoryx_name_conversion.dylib`iox_sem_init(iox_sem_t*, int, unsigned int) + 64
    frame #3: 0x0000000102859852 librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>::SmartC(char const*, int, char const*, int (&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) + 162
    frame #4: 0x00000001028596cd librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>::SmartC(char const*, int, char const*, int (&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) + 157
    frame #5: 0x0000000102855143 librmw_iceoryx_name_conversion.dylib`iox::cxx::SmartC<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int> iox::cxx::makeSmartCImpl<int (iox_sem_t*, int, unsigned int), int, iox_sem_t*, int, unsigned int>(char const*, int, char const*, int  const(&)(iox_sem_t*, int, unsigned int), iox::cxx::ReturnMode const&, std::initializer_list<int> const&, std::initializer_list<int> const&, iox_sem_t*, int, unsigned int) + 403
    frame #6: 0x0000000102854103 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::init(iox_sem_t*, int, unsigned int) + 243
    frame #7: 0x0000000102854399 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::Semaphore(iox::posix::CreateUnnamedSharedMemorySemaphore_t, iox_sem_t*, unsigned int) + 201
    frame #8: 0x0000000102854413 librmw_iceoryx_name_conversion.dylib`iox::posix::Semaphore::Semaphore(iox::posix::CreateUnnamedSharedMemorySemaphore_t, iox_sem_t*, unsigned int) + 35
    frame #9: 0x00000001028266a6 librmw_iceoryx_name_conversion.dylib`iox::cxx::expected<iox::posix::Semaphore, iox::posix::SemaphoreError> DesignPattern::Creation<iox::posix::Semaphore, iox::posix::SemaphoreError>::create<iox::posix::CreateUnnamedSharedMemorySemaphore_t const&, iox_sem_t*, int>(iox::posix::CreateUnnamedSharedMemorySemaphore_t const&, iox_sem_t*&&, int&&) + 150
    frame #10: 0x0000000102826515 librmw_iceoryx_name_conversion.dylib`iox::popo::ReceiverPort::GetShmSemaphore() + 149
    frame #11: 0x0000000101f1f1be librmw_iceoryx_cpp.dylib`iox::popo::Subscriber_t<iox::popo::ReceiverPort>::getSemaphore() const + 30
    frame #12: 0x0000000101f1eda6 librmw_iceoryx_cpp.dylib`rmw_create_wait_set + 902
    frame #13: 0x000000010139d2d5 librmw_implementation.dylib`::rmw_create_wait_set(v2=0x0000000102a00570, v1=2) at functions.cpp:483:1
    frame #14: 0x000000010137125a librcl.dylib`rcl_wait_set_init(wait_set=0x00007ffeefbfd9d8, number_of_subscriptions=0, number_of_guard_conditions=2, number_of_timers=0, number_of_clients=0, number_of_services=0, number_of_events=0, context=0x0000000102a00190, allocator=rcl_allocator_t @ 0x00007ffeefbfa950) at wait.c:163:34
    frame #15: 0x0000000100334a27 librclcpp.dylib`rclcpp::Executor::Executor(this=0x00007ffeefbfd9b8, options=0x00007ffeefbfc610) at executor.cpp:66:9
    frame #16: 0x000000010035ece7 librclcpp.dylib`rclcpp::executors::SingleThreadedExecutor::SingleThreadedExecutor(this=0x00007ffeefbfd9b8, options=0x00007ffeefbfc610) at single_threaded_executor.cpp:22:3
    frame #17: 0x000000010035ed2d librclcpp.dylib`rclcpp::executors::SingleThreadedExecutor::SingleThreadedExecutor(this=0x00007ffeefbfd9b8, options=0x00007ffeefbfc610) at single_threaded_executor.cpp:22:29
    frame #18: 0x00000001000106b0 talker`main(argc=1, argv=0x00007ffeefbfdba8) at node_main_talker.cpp:30:45
    frame #19: 0x00007fff663303d5 libdyld.dylib`start + 1
    frame #20: 0x00007fff663303d5 libdyld.dylib`start + 1

I can't tell if that's due to an outdated RMW though.

As a side note, my console output gets flooded with messages in the style of:

[ 2 ]  No such file or directory
/Users/karsten/workspace/ros2/iceoryx_ws/osx/src/iceoryx/iceoryx_utils/source/posix_wrapper/unix_domain_socket.cpp:379 { cxx::expected<int32_t, IpcChannelError> iox::posix::UnixDomainSocket::createSocket(const iox::posix::IpcChannelMode) }  :::

This happened because iox-roudi was not started. A warning message was printed at the beginning, but the constant console output made it almost impossible to notice.
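One way to keep a single startup warning visible is to collapse repeated identical error lines. The following is a minimal sketch of that idea, not part of iceoryx: the class name `RepeatSuppressor` and its interface are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not an iceoryx API): decides whether a log line
// should be printed, so that an identical message repeated many times
// in a row shows up only once per `reportEvery` occurrences.
class RepeatSuppressor
{
  public:
    explicit RepeatSuppressor(uint64_t reportEvery)
        : m_reportEvery(reportEvery)
    {
        assert(reportEvery > 0);
    }

    // Returns true if the caller should print `message`.
    bool shouldPrint(const std::string& message)
    {
        // A new message (or the very first one) is always printed.
        if (m_count == 0 || message != m_last)
        {
            m_last = message;
            m_count = 1;
            return true;
        }
        // A repeated message is printed only on every reportEvery-th repeat.
        ++m_count;
        return m_count % m_reportEvery == 0;
    }

  private:
    uint64_t m_reportEvery;
    std::string m_last;
    uint64_t m_count{0};
};
```

A caller would wrap each error print in `if (suppressor.shouldPrint(text)) { ... }`. With `reportEvery` set to 100, a socket error repeated 250 times in a row would be printed three times instead of 250.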

marthtz commented 3 years ago

I'm setting up my Macbook this week. Please be patient, I'll have a look!

marthtz commented 3 years ago

@Karsten1987 So, to reproduce it I follow the steps on rmw_iceoryx and then start a ROS2 app? If you can give me some details I can start working on this issue!

budrus commented 3 years ago

With the upcoming iceoryx 0.9 release we break the API and have to adapt rmw_iceoryx, i.e. this error can only be analyzed with iceoryx 0.17.0. My proposal would be to not officially support Mac with rmw_iceoryx based on 0.17.0. We have to ensure that iceoryx is running fine on Mac with the 1.0 release; then we can implement the next generation of the rmw and check that Mac is working as well.

Fine for all? @Karsten1987 @marthtz

budrus commented 3 years ago

@marthtz Please check the upcoming 0.9 release on your Mac.

marthtz commented 3 years ago

It looks like it never worked. The issue is that the unnamed shared semaphore is not working: shared memory semaphores created in RouDi can't be used in apps in the current implementation. That's also visible in the backtrace above. The problem is that there don't seem to be any tests checking shared semaphore usage across multiple processes.
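As an illustration of how a semaphore handle placed in shared memory can become unusable in another process, here is a minimal sketch. It assumes, without confirmation from this thread, that the handle stores a pointer to process-private state. The backtrace is at least consistent with this: `iox_sem_init` receives `sem=0x0000000106365710` but the faulting store targets `0x7ffc1be007c0`. The types `PointerHandle` and `InlineHandle` are hypothetical and are not the real `iox_sem_t`.

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handle layouts for illustration only (not the real iox_sem_t).
struct PointerHandle
{
    std::atomic<int>* value; // points into process-private memory
};

struct InlineHandle
{
    std::atomic<int> value; // state lives inside the shared mapping itself
};

// Maps an anonymous shared region and constructs a T in it.
template <typename T>
T* mapShared()
{
    void* mem = mmap(nullptr, sizeof(T), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    assert(mem != MAP_FAILED);
    return new (mem) T{};
}

// Child stores through a pointer kept in shared memory.
// Returns true if the parent observes the child's store.
bool childStoreVisibleViaPointer()
{
    static std::atomic<int> privateValue{0}; // each process has its own copy after fork
    PointerHandle* handle = mapShared<PointerHandle>();
    handle->value = &privateValue;

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0)
    {
        handle->value->store(42); // modifies only the child's private copy
        _exit(0);
    }
    waitpid(pid, nullptr, 0);

    bool visible = (privateValue.load() == 42);
    munmap(handle, sizeof(PointerHandle));
    return visible;
}

// Child stores directly into state embedded in the shared mapping.
// Returns true if the parent observes the child's store.
bool childStoreVisibleInline()
{
    InlineHandle* handle = mapShared<InlineHandle>();

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0)
    {
        handle->value.store(42); // written to memory both processes share
        _exit(0);
    }
    waitpid(pid, nullptr, 0);

    bool visible = (handle->value.load() == 42);
    munmap(handle, sizeof(InlineHandle));
    return visible;
}
```

In this sketch the pointer variant silently loses the update, because after `fork` the address still resolves to a private copy. Between two unrelated processes such as RouDi and an application, the same address may not be mapped at all, which would surface as an `EXC_BAD_ACCESS` like the one reported.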

elfenpiff commented 3 years ago

@dkroenke @budrus

We definitely need real inter-process tests, otherwise we will run into such an issue again - the roudi environment will never catch such issues. At the moment the waitset is NOT working on macOS, and this needs to be fixed by fixing the macOS semaphore platform implementation. This should not be too time-consuming, but at the moment I am unable to implement it.
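A real inter-process check for an unnamed shared semaphore could look roughly like the sketch below. It uses the plain POSIX calls `sem_init`, `sem_post` and `sem_trywait` rather than the iceoryx wrapper, and the names `SharedSemResult` and `postFromChildWaitInParent` are hypothetical. macOS does not implement `sem_init` and fails with `ENOSYS`, so the sketch reports that case separately instead of treating it as a test failure.

```cpp
#include <cassert>
#include <cerrno>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type for this sketch.
enum class SharedSemResult
{
    Ok,              // the post made in the child was observed by the parent
    InitUnsupported, // sem_init is not implemented on this platform
    Failed           // any other error
};

SharedSemResult postFromChildWaitInParent()
{
    // Place the semaphore in memory shared between parent and child.
    void* mem = mmap(nullptr, sizeof(sem_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
    {
        return SharedSemResult::Failed;
    }
    sem_t* sem = static_cast<sem_t*>(mem);

    // pshared = 1 requests a process-shared semaphore, initial value 0.
    if (sem_init(sem, 1, 0) != 0)
    {
        SharedSemResult result = (errno == ENOSYS) ? SharedSemResult::InitUnsupported
                                                   : SharedSemResult::Failed;
        munmap(mem, sizeof(sem_t));
        return result;
    }

    pid_t pid = fork();
    if (pid < 0)
    {
        sem_destroy(sem);
        munmap(mem, sizeof(sem_t));
        return SharedSemResult::Failed;
    }
    if (pid == 0)
    {
        // Child process: post once and report success via the exit code.
        _exit(sem_post(sem) == 0 ? 0 : 1);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    bool childOk = WIFEXITED(status) && WEXITSTATUS(status) == 0;

    // The child has already exited, so a non-blocking trywait is enough:
    // it succeeds only if the child's post crossed the process boundary.
    bool acquired = childOk && (sem_trywait(sem) == 0);

    sem_destroy(sem);
    munmap(mem, sizeof(sem_t));
    return acquired ? SharedSemResult::Ok : SharedSemResult::Failed;
}
```

A test against the iceoryx semaphore wrapper would follow the same shape: create the semaphore in one process, post from a second process, and verify in the first that the post arrived.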

elfenpiff commented 3 years ago

@Karsten1987 @marthtz I think this issue can be closed and it is fixed in master. Could you please verify?

Karsten1987 commented 3 years ago

Do you have instructions for this? The current rmw_iceoryx master doesn't build with iceoryx. I don't really know how to run the ROS2 demo without the rmw.

elfenpiff commented 3 years ago

@Karsten1987 I created two issues, https://github.com/ros2/rmw_iceoryx/issues/41 and https://github.com/ros2/rmw_iceoryx/issues/40. Additionally, I created a support issue on our side to track those two: https://github.com/eclipse-iceoryx/iceoryx/issues/511. Therefore, I would close this issue for now and continue working on rmw_iceoryx in the other ones.

The reason is that I am pretty sure there was an issue with the shared memory semaphore which caused the bug, and it should be fixed now. But we have no way to be sure while rmw_iceoryx is not even building with iceoryx anymore.