ros2 / rclcpp

rclcpp (ROS Client Library for C++)
Apache License 2.0
514 stars 410 forks source link

Revert "call shutdown in LifecycleNode dtor to avoid leaving the devi… (backport #2522) #2524

Closed mergify[bot] closed 1 month ago

mergify[bot] commented 1 month ago

…ce in un… (#2450)"

This reverts commit 04ea0bb00293387791522590b7347a2282cda290.

Ever since this commit was merged, tests have been (quietly) failing in CI. You can see this in any of the CI jobs since April 7th, for instance in https://ci.ros2.org/view/nightly/job/nightly_linux_release/3068/:

test 7
      Start  7: test_lifecycle_publisher

7: Test command: /usr/bin/python3 "-u" "/home/jenkins-agent/workspace/nightly_linux_release/ws/install/ament_cmake_test/share/ament_cmake_test/cmake/run_test.py" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml" "--package-name" "rclcpp_lifecycle" "--output-file" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/ament_cmake_gtest/test_lifecycle_publisher.txt" "--command" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_lifecycle_publisher" "--gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml"
7: Working Directory: /home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle
7: Test timeout computed to be: 60
7: -- run_test.py: invoking following command in '/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle':
7:  - /home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_lifecycle_publisher --gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml
7: Running main() from /home/jenkins-agent/workspace/nightly_linux_release/ws/install/gtest_vendor/src/gtest_vendor/src/gtest_main.cc
7: [==========] Running 4 tests from 1 test suite.
7: [----------] Global test environment set-up.
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer
7: [WARN] [1715154999.943244122] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.943921662] [node]: Unable to start transition 6 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.943982879] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(inactive)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer (35 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer
7: [WARN] [1715154999.955733453] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.956197503] [node]: Unable to start transition 6 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.956255396] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(inactive)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer (11 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/wall_timer
7: [WARN] [1715154999.965462012] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.967268333] [node]: Unable to start transition 5 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.967322508] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(unconfigured)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/wall_timer (12 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/generic_timer
7: [WARN] [1715154999.977546208] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.977927756] [node]: Unable to start transition 5 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.977936306] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(unconfigured)
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/generic_timer (7 ms)
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher (67 ms total)
7: 
7: [----------] Global test environment tear-down
7: [==========] 4 tests from 1 test suite ran. (67 ms total)
7: [  PASSED  ] 4 tests.

What that should look like (and does after this revert) is:

test 7
    Start 7: test_lifecycle_publisher

7: Test command: /usr/bin/python3 "-u" "/home/ubuntu/rolling_ws/install/ament_cmake_test/share/ament_cmake_test/cmake/run_test.py" "/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml" "--package-name" "rclcpp_lifecycle" "--output-file" "/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/ament_cmake_gtest/test_lifecycle_publisher.txt" "--command" "/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_lifecycle_publisher" "--gtest_output=xml:/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml"
7: Working Directory: /home/ubuntu/rolling_ws/build/rclcpp_lifecycle
7: Test timeout computed to be: 60
7: -- run_test.py: invoking following command in '/home/ubuntu/rolling_ws/build/rclcpp_lifecycle':
7:  - /home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_lifecycle_publisher --gtest_output=xml:/home/ubuntu/rolling_ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml
7: Running main() from /home/ubuntu/rolling_ws/install/gtest_vendor/src/gtest_vendor/src/gtest_main.cc
7: [==========] Running 4 tests from 1 test suite.
7: [----------] Global test environment set-up.
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer
7: [WARN] [1715204443.848895590] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer (60 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer
7: [WARN] [1715204443.876967893] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/generic_timer (24 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/wall_timer
7: [WARN] [1715204443.900725954] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/wall_timer (23 ms)
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish/generic_timer
7: [WARN] [1715204443.923513955] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [       OK ] PerTimerType/TestLifecyclePublisher.publish/generic_timer (23 ms)
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher (130 ms total)
7: 
7: [----------] Global test environment tear-down
7: [==========] 4 tests from 1 test suite ran. (131 ms total)
7: [  PASSED  ] 4 tests.

I'm not sure of this, but I think the problem is that when this is being destroyed, you can't assume that all of the lower-level things have not yet been destroyed, so you shouldn't actually transition. If that is something we expect to be able to do, then we probably need to take some weak or shared pointers to the infrastructure we need so it isn't destroyed.

Regardless, we don't have that in place right now, so we should revert this change. @fujitatomoya @alsora @Barry-Xu-2018 FYI (we should probably revert this from jazzy, iron and humble as well).


This is an automatic backport of pull request #2522 done by Mergify.

clalancette commented 1 month ago

CI: