ros / ros_comm

ROS communications-related packages, including core client libraries (roscpp, rospy, roslisp) and graph introspection tools (rostopic, rosnode, rosservice, rosparam).
http://wiki.ros.org/ros_comm
753 stars 910 forks source link

roslaunch exit code 0 even if requried node died with non-zero exit code #919

Open sd1074 opened 7 years ago

sd1074 commented 7 years ago

Currently (at least in my indigo installation), if one of the nodes launched by roslaunch dies due to an error or from a signal, roslaunch is able to catch that and report, but its exit code is still 0. This makes if hard to verify whether roslaunch was successful or not. I am not sure if it's a bug report or a feature request, but I think propagating required node's exit code to ros roslaunch exit code would be a useful imp.

The described behavior was confirmed here: http://answers.ros.org/question/219115/exit-code-of-roslaunch-shutdown-on-node-death/

It can also be easily verified with a simple launch file:

<?xml version="1.0"?>
<launch>
    <node pkg="turtlesim" type="turtlesim_node" name="turtlesim_node" output="screen" required="true" />
</launch>

run it, then kill the node: $ kill -9 <TURTLESIM_NODE_PID>

roslaunch then shuts down with:

================================================================================REQUIRED process [turtlesim_node-1] has died!
process has died [pid 4832, exit code -9, cmd /opt/ros/indigo/lib/turtlesim/turtlesim_node __name:=turtlesim_node __log:=/home/USER/.ros/log/a1339242-9d30-11e6-b60e-9c4e36311474/turtlesim_node-1.log].
log file: /home/sd/.ros/log/a1339242-9d30-11e6-b60e-9c4e36311474/turtlesim_node-1*.log
Initiating shutdown!
================================================================================
[turtlesim_node-1] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete
done

The reported error code is:

$ echo $?
0

I believe that even if not via exit code, there should be some easy mechanism to check whether roslaunch succeeded or failed (based on the required node exit code).

TeeJayR commented 7 years ago

One could directly check the exit code of the started nodes as roslaunch prints the node's pid, but I don`t see a way to do this programmatically and agree that roslaunch should forward the exit code of the node. But what if multiple nodes have been started by roslaunch?

alspitz commented 6 years ago

At least if a required node dies, I think it would be better if roslaunch returned any non-zero exit code instead of 0.

xkortex commented 4 years ago

This makes it challenging to containerize ROS nodes or stacks.

 [nodename] process has died [pid 136, exit code -6 ...

is what I see when a node has an unhandled exception, then the container goes down,

containername- exited with code 0

No obvious way to intercept this, either. Can't trust presence of stderr, as some applications will roslog error but then recover.

LukeAI commented 4 years ago

Was there a conclusion to this? I also want to be able to programmatically check whether or not a roslaunch failed. @dirk-thomas I see you added the enhancement tag - do you know if there is some way of getting the exit code of the child node?

dirk-thomas commented 4 years ago

I assume not at the moment. Please consider to contribute a pull request to enable this.

xkortex commented 4 years ago

I might be able to do it if someone could point roughly in the stack where to look or what keywords to grep for.

lucasw commented 4 years ago

@xkortex I'm also trying to get failed required nodes to make ci jobs fail, and started experimenting with this in https://github.com/lucasw/ros_comm/commit/0eb9ba4400e01cf2383387c3cb2e6a4f5cdff223 . Use with caution- there were some odd cases where roslaunch lost track of child processes but I think I'm avoiding that now.

betaboon commented 3 years ago

@lucasw any update on your branch? is there anything we can do to assist you in getting this finalized/tested? i got here due to the same reason: a simple smoketest in ci :)

lucasw commented 3 years ago

@betaboon I put it into a PR #2082