Open ahendrix opened 11 years ago
[gerkey] There was an older ticket against rostest for what looks like the same issue: https://code.ros.org/trac/ros/ticket/2462
The conclusion there was that the wrong version of Python's coverage module was in use. I've verified that on the machine running the build where this hang occurs, coverage 3.2b4, from easy_install, is in use.
[kwc] The current coverage implementation does some mean things in order to get better coverage, because anything that occurs before coverage starts (e.g. an import) is otherwise missed. In particular, it reloads the module(s).
It's possible this causes a difference in test behavior, though in the past I have only seen it cause incorrect test failures (e.g. due to value comparisons failing with reloaded symbols). It wouldn't explain a test that previously did not hang now hanging.
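To illustrate the "value comparisons failing with reloaded symbols" point, here is a minimal sketch of what a module reload does to object identity. It does not use the coverage module or rostest at all; the module name `reload_demo_mod` and its contents are hypothetical, and it uses Python 3's `importlib.reload` (on the Python 2.5 in this ticket, the builtin `reload` plays that role).

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module, written to a temp dir purely for illustration.
MODULE_SOURCE = "class Status(object):\n    pass\n\nOK = Status()\n"


def demo_reload_identity():
    """Import a module, keep references to its symbols, reload it, compare."""
    tmpdir = tempfile.mkdtemp()
    with open(os.path.join(tmpdir, "reload_demo_mod.py"), "w") as f:
        f.write(MODULE_SOURCE)
    sys.path.insert(0, tmpdir)
    try:
        importlib.invalidate_caches()
        mod = importlib.import_module("reload_demo_mod")
        # References taken before the reload, as a test might hold them.
        old_class = mod.Status
        old_ok = mod.OK
        # Re-executes the module body, rebinding Status and OK to new objects.
        importlib.reload(mod)
        return {
            "same_class": old_class is mod.Status,
            "old_instance_matches_new_class": isinstance(old_ok, mod.Status),
            "equal_instances": old_ok == mod.OK,
        }
    finally:
        sys.path.remove(tmpdir)
        sys.modules.pop("reload_demo_mod", None)


result = demo_reload_identity()
print(result)
```

All three checks come out false: the reload creates a new class object and a new instance, so anything a test captured before the reload no longer compares equal to what the module exposes afterwards. That would account for spurious failures, but not obviously for a hang.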
[gerkey] This is happening again. Here are the hanging processes (http://build.willowgarage.com/job/ros-latest-all-covtest-ubuntu-hardy-x86/14/):
{{{
wgsim 1749 0.0 0.0 1772 488 ? S 06:48 0:00 /bin/sh -c cd /home/wgsim/workspace/ros-latest-all-covtest-ubuntu-hardy-x86/ros-pkg/pr2_robot/pr2_camera_synchronizer && python test/test_classes.py --cov --gtest_output=xml:/home/wgsim/workspace/ros-latest-all-covtest-ubuntu-hardy-x86/ros/test/test_results/pr2_camera_synchronizer/test_test_classes.py.xml
wgsim 1750 0.0 0.3 22824 10880 ? Sl 06:48 0:00 python test/test_classes.py --cov --gtest_output=xml:/home/wgsim/workspace/ros-latest-all-covtest-ubuntu-hardy-x86/ros/test/test_results/pr2_camera_synchronizer/test_test_classes.py.xml
}}}
Here's some version info on python-coverage: {{{ <module 'coverage' from '/usr/lib/python2.5/site-packages/coverage-3.2b4-py2.5-linux-i686.egg/coverage/__init__.pyc'> }}}
@ZdenekM that build failure does not appear to be related to code-coverage tools in python; it just looks like the build is broken.
I found a coverage build apparently hung (http://build.willowgarage.com/job/ros-latest-all-covtest-ubuntu-hardy-x86/2/), with the following culprit processes:
{{{
wgsim 17076 0.0 0.0 1972 824 ? S 07:35 0:00 make -f CMakeFiles/pyunit_test_test_classes.py.dir/build.make CMakeFiles/pyunit_test_test_classes.py.dir/build
wgsim 17077 0.0 0.0 1772 484 ? S 07:35 0:00 /bin/sh -c cd /home/wgsim/workspace/ros-latest-all-covtest-ubuntu-hardy-x86/ros-pkg/pr2_robot/pr2_camera_synchronizer && python test/test_classes.py --cov --gtest_output=xml:/home/wgsim/workspace/ros-latest-all-covtest-ubuntu-hardy-x86/ros/test/test_results/pr2_camera_synchronizer/test_test_classes.py.xml
wgsim 17078 0.0 0.3 22528 10860 ? Sl 07:35 0:00 python test/test_classes.py --cov --gtest_output=xml:/home/wgsim/workspace/ros-latest-all-covtest-ubuntu-hardy-x86/ros/test/test_results/pr2_camera_synchronizer/test_test_classes.py.xml
}}}
Killing 17078 (the pyunit test) allowed the build to proceed.
I'm unable to replicate this on my own machine, and it didn't hang on the previous iteration of the coverage build (http://build.willowgarage.com/job/ros-latest-all-covtest-ubuntu-hardy-x86/1/).
This build uses Bullseye's wrappers, but that doesn't seem relevant because this package doesn't seem to do any compiling. More likely relevant is the --cov argument that's being passed to Python to turn on Python's coverage module.
Not sure how to track this down.
I'll note that https://code.ros.org/trac/ros/ticket/1629 would prevent the hang, which would make this problem much less serious.
trac data: