code-iai / ROSIntegration

Unreal Engine Plugin to enable ROS Support
MIT License

Errors in ROS when publishing messages in Unreal #74

Open jteuber opened 5 years ago

jteuber commented 5 years ago

Hi there!

For a while now, I have occasionally been getting these kinds of error messages from nodes that subscribe to topics on which I publish messages with ROSIntegration.

Most of the time it's something like this:

[ERROR] [1552475202.508205515]: a message of over a gigabyte was predicted in tcpros. that seems highly unlikely, so I'll assume protocol synchronization is lost.
[ERROR] [1552475202.508506874]: Exception thrown when deserializing message of length [0] from [/rosbridge_tcp]: Buffer Overrun
[ERROR] [1552475202.508568769]: Exception thrown when deserializing message of length [19] from [/rosbridge_tcp]: Buffer Overrun

Sometimes something like this happens:

[ERROR] [1552418004.467198903]: Exception thrown when deserializing message of length [0] from [/rosbridge_tcp]: Buffer Overrun
[ERROR] [1552418004.467337097]: Exception thrown when deserializing message of length [6] from [/rosbridge_tcp]: Buffer Overrun
[ERROR] [1552418004.475854]: [Client 0] [id: publish:/ugv_0/odometry:43] publish: integer out of range for 'I' format code

And sometimes it is a mixture of both. Because of that last error, I already tried to find an integer in the header of nav_msgs/Odometry that might be out of bounds. But the only integers there are in the timestamp, which I can guarantee is not out of bounds, and the sequence ID, which, as I learned today, is discarded and refilled by ROS anyway (https://github.com/ros2/common_interfaces/issues/1#issuecomment-112621348).

The errors mostly happen in the first few seconds of a node's execution and crash the node. If it runs for about 5 seconds without crashing, it doesn't crash afterward, as far as I can tell. I have tried running a node that throws these errors in gdb to debug it, but the exceptions are caught somewhere else and I don't get a chance to see where they come from. I also can't find anything relevant when googling for the error messages.

Do you have any idea where this comes from or where to start debugging? Thanks in advance!
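For context on the "integer out of range for 'I' format code" error quoted above: that wording matches what Python's `struct` module reports when a value does not fit an unsigned 32-bit field. The sketch below only illustrates that mechanism. That rosbridge packs the header's `seq`, `stamp.secs` and `stamp.nsecs` this way is an assumption, and `fits_uint32` and the sample values are hypothetical, not part of rosbridge or ROSIntegration.

```python
import struct

UINT32_MAX = 2**32 - 1

def fits_uint32(value):
    """Return True if value can be packed with the 'I' (unsigned 32-bit) format."""
    try:
        struct.pack("<I", value)
        return True
    except struct.error:
        # Raised for negative values and for values above 2**32 - 1.
        return False

# Hypothetical header values, for illustration only.
candidates = {
    "secs": 1552475202,
    "nsecs": 508205515,
    "negative": -1,
    "too_big": UINT32_MAX + 1,
}
for name, value in candidates.items():
    print(name, value, "ok" if fits_uint32(value) else "out of range")
```

A check like this could be run on the integer fields just before publishing, to rule out a negative or uninitialized value reaching the serializer.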

Sanic commented 5 years ago

Hi!

Is this maybe a problem in rosbridge? Are there any errors in the rosbridge console? Does it help to run a different version of rosbridge, or to run it on another host? Does the same problem happen with other message types, or is it only related to the odometry message? If you want to hook into the data rosbridge is getting from ROSIntegration, you could try to get some information by hacking into the rosbridge TCP handler code (https://github.com/RobotWebTools/rosbridge_suite/blob/develop/rosbridge_server/src/rosbridge_server/tcp_handler.py). This is where I usually looked for protocol issues between UE4 and rosbridge.
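One way to inspect the raw bytes arriving at the TCP handler is sketched below. It assumes the connection carries back-to-back BSON documents, as the `bson_only_mode:=True` launch argument mentioned later in this thread suggests. It relies only on the BSON rule that each document starts with its total size as a little-endian int32 and ends with a 0x00 byte. `split_bson_documents` is a hypothetical helper, not part of rosbridge.

```python
import struct

def split_bson_documents(buffer):
    """Split a byte buffer into complete BSON documents.

    Returns (documents, remainder). Raises ValueError if a length prefix
    or terminator is implausible, which would suggest that the stream
    has lost synchronization.
    """
    documents = []
    offset = 0
    while len(buffer) - offset >= 4:
        # Every BSON document starts with its total size (little-endian int32).
        (size,) = struct.unpack_from("<i", buffer, offset)
        if size < 5:
            raise ValueError("implausible BSON length %d at offset %d" % (size, offset))
        if len(buffer) - offset < size:
            break  # document is incomplete; wait for more bytes
        document = buffer[offset:offset + size]
        if document[-1:] != b"\x00":
            raise ValueError("BSON document at offset %d lacks its terminator" % offset)
        documents.append(document)
        offset += size
    return documents, buffer[offset:]

# The empty BSON document is 5 bytes: the int32 size (5) plus the 0x00 terminator.
empty = struct.pack("<i", 5) + b"\x00"
docs, rest = split_bson_documents(empty + empty + empty[:3])
print(len(docs), "complete documents,", len(rest), "bytes left over")
```

Logging the point at which this raises would show whether the byte stream is already malformed when it reaches rosbridge, or whether the corruption appears later.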

jteuber commented 5 years ago

Thanks for your answer, it helped me dig a bit deeper. Unfortunately, it continues to be weird. It doesn't seem to be rosbridge; as far as I can tell, it goes deeper than that. Consider the following screenshot:

[screenshot: grafik]

The top-left terminal is rosbridge. I tested the version that comes with Ubuntu and ROS Kinetic Kame as well as the current develop and master branches from GitHub. They all produce the same results. All other terminals are nodes run individually to narrow down the problem. On the right, I have two nodes that get odometry as input and one that gets a list of poses as input (so the problem might be somewhere in the pose message, though I can't find it). On the left is one node that gets IMU messages as input and another one that only uses a header message. Those seem fine; I never got an error with them. This is all in a virtual machine, so transport should not be an issue.

samkys commented 5 years ago

What does your code look like where you actually populate your messages in UE4? Looking at the screenshot of the terminals, it is hard to follow what you are trying to do, even after reading the comment.

To make debugging easier, leave out all of your own nodes and launch only the minimum on the ROS side:

roslaunch rosbridge_server rosbridge_tcp.launch bson_only_mode:=True

Then open only one terminal and run:

rostopic echo /yourTopic

In UE4, publish only one topic: /yourTopic

This makes it easier for you as well as for whoever else reads your question, me for example.

jteuber commented 5 years ago

That is actually one of the problems that makes this really hard to debug: the errors don't show up when I publish to only one topic. The first time I saw them was when 3 different topics with different message types were fed simultaneously in Unreal. And even then, whether they show up is quite nondeterministic. I'm currently chasing a deadline, so I don't have a lot of time to invest in this. But once that deadline has passed, I'll try to get to the bottom of this and create a minimal breaking example.

samkys commented 5 years ago

Sounds good. I haven't seen that particular issue and have had success publishing to 7 topics so far, so I am curious where that error is produced. When you have time, attach the code snippet that fills the ROS message on the Unreal side.