foxglove / ros-foxglove-bridge

Foxglove WebSocket bridge for ROS 1 and ROS 2
MIT License
Assertion error on read_callback_ in roscpp for ROS1 bridge #306

Closed SoftwareApe closed 2 months ago

SoftwareApe commented 3 months ago

Description

We're getting an assertion error when running the ROS1 bridge:

[FATAL] [1716982265.933923467]: ASSERTION FAILED
        file = ./clients/roscpp/src/libros/connection.cpp
        line = 274
        cond = !read_callback_

Steps To Reproduce

Compile and start the ROS1 bridge.

Expected Behavior

This assertion should not be triggered.

While debugging, we found that the assertion is not triggered immediately. If we add a sufficiently long sleep before the call to getConnection(), the assertion is not triggered at startup, the ROS1 bridge works, and we can get a connection. However, at some later point we still hit the same assertion.

--- ros1_foxglove_bridge/src/service_utils.cpp
+++ ros1_foxglove_bridge/src/service_utils.cpp
@@ -1,5 +1,6 @@
 #include <chrono>
 #include <future>
+#include <thread>

 #include <ros/connection.h>
 #include <ros/service_manager.h>
@@ -12,6 +13,10 @@ namespace foxglove_bridge {
 std::string retrieveServiceType(const std::string& serviceName, std::chrono::milliseconds timeout) {
   auto link = ros::ServiceManager::instance()->createServiceServerLink(serviceName, false, "*", "*",
                                                                        {{"probe", "1"}});
+
+  // Wait for connection to be ready
+  std::this_thread::sleep_for(std::chrono::milliseconds(100));
+
   if (!link) {
     throw std::runtime_error("Failed to create service link");
   } else if (!link->getConnection()) {
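
A fixed 100 ms sleep only works if the connection happens to be ready within that window. As a rough sketch of a more bounded alternative, the helper below polls a readiness check until a deadline. The name waitUntil and the readiness predicate are hypothetical and not part of foxglove_bridge or roscpp; whether the link's state can safely be polled from this thread is an assumption the thread does not confirm, and this is not the change made in #316.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: polls `ready` until it returns true or `timeout` elapses.
// Returns true if the predicate became true, false if the deadline passed first.
template <typename Predicate>
bool waitUntil(Predicate ready, std::chrono::milliseconds timeout,
               std::chrono::milliseconds pollInterval = std::chrono::milliseconds(5)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (!ready()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // Gave up: the caller decides how to report the timeout.
    }
    std::this_thread::sleep_for(pollInterval);
  }
  return true;
}
```

Like the sleep in the diff, this only narrows the timing window at startup; it would not explain or prevent the assertion that still appears later.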

linear[bot] commented 3 months ago

FG-7877 Assertion error on read_callback_ in roscpp for ROS1 bridge

achim-k commented 2 months ago

I have trouble reproducing this issue. Is there anything special about your setup, e.g. are service servers running on another machine? Do you see any "Timed out when retrieving service type" error messages?

achim-k commented 2 months ago

@SoftwareApe gentle ping

jurevreca12 commented 2 months ago

@achim-k Hi, I am getting the same error. However, I am running on ARM (zcu104 board). It's been a while since I tried running it, so I don't remember the specifics anymore. But I can try running it again tomorrow and see.

SoftwareApe commented 2 months ago

> I have trouble reproducing this issue. Is there anything special about your setup, e.g. are service servers running on another machine? Do you see any "Timed out when retrieving service type" error messages?

Sorry, I was on the go for the last few weeks. Currently everything is running on the same dev machine, so there's only the loopback connection in between.

I will need to check on the service type messages. A colleague of mine took over this integration, but I do think there were some messages of that sort.

I was thinking maybe it's the specific version of roscpp being used. Do you even have this assertion on line 274 in your libros/connection.cpp?

achim-k commented 2 months ago

Yes, that assertion is on latest noetic: https://github.com/ros/ros_comm/blob/845f74602c7464e08ef5ac6fd9e26c97d0fe42c9/clients/roscpp/src/libros/connection.cpp#L274

It's strange because it checks if the read_callback_ is already set, and the only place it is set is 2 lines below. So apparently there has been a read() before?
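
To illustrate the pattern being described, here is a toy model of a connection with a single pending-read slot. It is only a sketch: ToyConnection, deliver, and hasPendingRead are made-up names, this is not roscpp's ros::Connection, and it throws instead of asserting so the failure can be observed. It shows that a second read() issued before the first one has completed is what makes a check like !read_callback_ fail.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>

// Toy stand-in for a connection that allows only one outstanding read.
// Illustrative only; not the roscpp implementation.
class ToyConnection {
public:
  using ReadCallback = std::function<void()>;

  // Registers a pending read. Fails if another read is still pending,
  // mirroring a check of the form `cond = !read_callback_`.
  void read(ReadCallback cb) {
    if (read_callback_) {
      throw std::logic_error("read already pending");
    }
    read_callback_ = std::move(cb);
  }

  // Simulates the requested data arriving: frees the slot, then runs the callback.
  void deliver() {
    if (!read_callback_) {
      return;
    }
    ReadCallback cb = std::move(read_callback_);
    read_callback_ = nullptr;  // Slot is free again before the callback runs.
    cb();
  }

  bool hasPendingRead() const { return static_cast<bool>(read_callback_); }

private:
  ReadCallback read_callback_;
};
```

In this model, calling read() twice without an intervening deliver() throws, while read(), deliver(), read() succeeds. If two code paths can both start a read on the same connection, the outcome depends on timing, which would be consistent with a sleep changing when the failure shows up.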

SoftwareApe commented 2 months ago

@achim-k I checked, and yes, we're getting messages such as: Failed to retrieve service type or service description of service /rosout/get_loggers: Timed out when retrieving service_type.

jurevreca12 commented 2 months ago

This is the error I am getting: [image]

foxglove_bridge-20-stdout.log [image]

foxglove nodelet log is empty.

Is there anything else I can try/report?

achim-k commented 2 months ago

@jurevreca12 are you also seeing "Failed to retrieve service type or service description of service" error messages? Maybe try launching with roslaunch --screen to force all log output to the screen.

achim-k commented 2 months ago

If you are not calling services from foxglove, you can set the capabilities parameter to not include the services capability: https://github.com/foxglove/ros-foxglove-bridge/blob/9bc30200524f55de6dab1a73c23a617218f16dcc/ros1_foxglove_bridge/launch/foxglove_bridge.launch#L15

This disables service discovery and prevents the node from crashing.

jurevreca12 commented 2 months ago

> @jurevreca12 are you also seeing "Failed to retrieve service type or service description of service" error messages? Maybe try launching with roslaunch --screen to force all log output to the screen.

No, I am not seeing this message. I launched with "roslaunch agv_init agv_websocket.launch &> log.txt", and here is the log file, if it helps: log.txt

achim-k commented 2 months ago

Can you run it with --screen, so roslaunch --screen agv_init agv_websocket.launch &> log.txt ?

jurevreca12 commented 2 months ago

> Can you run it with --screen, so roslaunch --screen agv_init agv_websocket.launch &> log.txt ?

Sorry, I misstated that. I did run it this way.

jurevreca12 commented 2 months ago

> If you are not calling services from foxglove, you can set the capabilities parameter to not include the services capability
>
> https://github.com/foxglove/ros-foxglove-bridge/blob/9bc30200524f55de6dab1a73c23a617218f16dcc/ros1_foxglove_bridge/launch/foxglove_bridge.launch#L15
>
> This disables service discovery and prevents the node from crashing.

This did indeed work. It does not crash now. Thanks. Services would be nice though.

achim-k commented 2 months ago

@SoftwareApe or @jurevreca12 could you give #316 a try and verify if it fixes the assertion crash? :pray:

jurevreca12 commented 2 months ago

Just so you know, I am working on trying it out. I have some stuff I need to take care of first, but I will test it today.

jurevreca12 commented 2 months ago

Yes, it works now :-)

achim-k commented 2 months ago

Awesome, thanks for testing!