Closed KKKiwiXU closed 1 year ago
os.system(f"ros2 lifecycle set {name} 1")
I would check whether this shell call returned successfully. It actually sends a service request to the lifecycle node, which probably means that the service response path for the 12th client does not exist yet when the server replies to the client.
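A minimal sketch of that check, assuming the `ros2 lifecycle set` CLI reports failure through its exit status (worth verifying on your distribution). The helper names `run_checked` and `set_lifecycle` and the 10-second timeout are made up for illustration; only the `ros2 lifecycle set {name} 1` command comes from the script above.

```python
import subprocess


def run_checked(cmd, timeout_sec=10.0):
    """Run a command and report whether it exited successfully.

    Returns (ok, output). ok is False on a non-zero exit status,
    on timeout, or when the executable cannot be found.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_sec
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_sec} s"
    except FileNotFoundError as exc:
        return False, str(exc)
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


def set_lifecycle(name, transition="1"):
    """Hypothetical wrapper around the command used in the original script."""
    return run_checked(["ros2", "lifecycle", "set", name, transition])
```

Unlike a bare `os.system` call, this makes a failed or hung transition visible to the script, so it can log the output and decide whether to retry or stop before moving on to the next node.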
All my nodes are using intra-process-communication
The above system call will not use intra-process communication. I would use the rclpy API in the application code instead of os.system.
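As a rough sketch of what that could look like, the snippet below calls a lifecycle node's `change_state` service through rclpy instead of shelling out. It assumes the standard `lifecycle_msgs` package is installed; the controller node name `lifecycle_controller`, the helper names, and the 5-second timeout are hypothetical choices, not something taken from this thread.

```python
def change_state_service(node_name):
    """Return the change_state service name of a lifecycle node."""
    return node_name.rstrip("/") + "/change_state"


def change_state(node_name, transition_id, timeout_sec=5.0):
    """Request a lifecycle transition via rclpy; return True on success."""
    # ROS imports are local so the helper above can be used without ROS.
    import rclpy
    from lifecycle_msgs.srv import ChangeState

    rclpy.init()
    node = rclpy.create_node("lifecycle_controller")  # hypothetical name
    try:
        client = node.create_client(
            ChangeState, change_state_service(node_name)
        )
        # Give up early if the service never becomes available.
        if not client.wait_for_service(timeout_sec=timeout_sec):
            return False
        request = ChangeState.Request()
        request.transition.id = transition_id
        future = client.call_async(request)
        rclpy.spin_until_future_complete(
            node, future, timeout_sec=timeout_sec
        )
        # The future is not done if the call timed out.
        result = future.result() if future.done() else None
        return result is not None and result.success
    finally:
        node.destroy_node()
        rclpy.shutdown()
```

With this, `change_state("/my_node", 1)` would request the same transition as `ros2 lifecycle set /my_node 1` (id 1 is `Transition.TRANSITION_CONFIGURE` in `lifecycle_msgs`), and the boolean result tells the caller whether the node accepted it.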
https://github.com/ros2/ros2/issues/1253 could be a similar problem.
Thanks for answering. I found that the timeout error occurred randomly on every node, not only on the 12th node. I also wrote C++ code that calls the service to invoke the transition function, but these attempts didn't work. I removed all the intra-process comms and applied the #704 change. This combination alleviates the problem, but it is not enough. We now have more than 50 topics, and there will be even more in the future. I would appreciate it if you could tell me whether there is any method to strictly specify the publisher and the subscription, which could replace the search-and-match stage when creating a publisher. That would solve this problem in a better way. Thanks.
We think that this may be solved by https://github.com/ros2/rclcpp/pull/2280 . If you can try that one out and see if it improves the situation for you, that would be very helpful. Thanks.
@KKKiwiXU https://github.com/ros2/rclcpp/pull/2280 has been merged to humble, which should fix the problem. I will go ahead and close this. If you still hit the problem, please feel free to reopen. Thanks for posting the issue.
I use the latest rclcpp, however, I still face the issue:
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError' what(): failed to send response: client will not receive response, at ./src/rmw_response.cpp:154, at ./src/rcl/service.c:314
What can I do to debug it? As far as I know, the exception is thrown by service.hpp in rclcpp, am I right?
I use the latest rclcpp, however, I still face the issue:
Can you share your distribution and the rclcpp version?
apt-show-versions | grep rclcpp
ros-humble-rclcpp:amd64/jammy 16.0.6-1jammy.20230919.213531 uptodate
and the command is as follows:
ros2 run demo_nodes_cpp add_two_ints_server
I wonder if I am missing something?
https://github.com/ros2/rclcpp/pull/2280 is available in rclcpp version 16.0.6-1jammy.20230919.213531, so I believe that you are using the correct version. The only thing that I can think of is that it returns some error other than RCL_RET_TIMEOUT. I would recommend that you create another issue for that.
Bug report
Hello everyone. I may have run into a bug. Required Info:
Steps to reproduce issue
I'm working on a project with over 20 processes and more than 40 nodes. My process is built as a one-process, multi-node framework that includes more than 12 nodes. All these nodes inherit from rclcpp::LifecycleNode and use the LCN callbacks (on_configure, on_activate, etc.) to control the lifecycle. All my nodes use intra-process communication with intra_process_comm=True, using UniquePtr and publisher->publish(std::move(pub_message)) to achieve zero-copy intra-process communication. If I only run my own process, everything goes well: LCN control works with no errors, and all node subscriptions and publications are correct. But if I run other nodes in other containers on my workstation, use ros2 bag play to play back a ros2 bag with a lot of sensor data, and use LCN to control my nodes, things go wrong.
It raises an exception:
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError' what(): failed to send response: client will not receive response, at /root/ros2_humble/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_response.cpp:153, at /root/ros2_humble/src/ros2/rcl/rcl/src/rcl/service.c:314
This error occurred while I was activating my 12th node; I don't know whether that is related. If I transition states with no message playback, this error may not occur. I use a Python script that uses os.system to control my nodes:
All nodes in my process are using the same qos profile:
I set the history depth to 1 because I have more than one node subscribing to the same node, and I found that a large history may cause message drops.
I have tried the https://github.com/ros2/rmw_fastrtps/pull/704 changes on the source code, and I confirmed that max_blocking_time was changed by printing it. This doesn't work. Thanks.