hidmic commented 4 years ago

Precisely what the title says. As it stands, test_generate_policy.py may fail because of discovery latencies, CLI daemon state spanning multiple tests and even using a non-matching RMW implementation. This circumvents that problem by simply not using it and providing 1 second for discovery to take place (which should be enough for Fast-RTPS at least).

mikaelarguedas commented 4 years ago

Yeah it's a trade-off, I considered removing it when it caused problems a few PRs ago but didn't do it for 2 reasons:

The wait time would need to be greater to account for discovery of various RMWs (e.g. connext is typically slower), otherwise we may end up trading one source of flakiness for another
Using the daemon in sros2 tests allows to find issues with the daemon that we wouldn't find with other packages, for example https://github.com/ros2/ros2cli/pull/501

It's not clear to me that this change is a clear win overall. Maybe it's worth having some tests using it and some tests not using it to have coverage for both scenarios

codecov[bot] commented 4 years ago

Codecov Report

Merging #212 into master will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #212   +/-   ##
=======================================
  Coverage   77.95%   77.95%           
=======================================
  Files          16       16           
  Lines         576      576           
  Branches       52       52           
=======================================
  Hits          449      449           
  Misses        107      107           
  Partials       20       20

Flag	Coverage Δ
#unittests	`77.95% <ø> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 57e08d8...70c94ef. Read the comment docs.

hidmic commented 4 years ago

The wait time would need to be greater to account for discovery of various RMWs (e.g. connext is typically slower), otherwise we may end up trading one source of flakiness for another

True, but the daemon doesn't really make things any better in this case i.e. where each test sets up a ROS graph of its own.

Using the daemon in sros2 tests allows to find issues with the daemon that we wouldn't find with other packages, for example ros2/ros2cli#501

True, and I agree we can change this test to cover both cases. But I do see this patch as an improvement, as it avoids cross-talk between tests and ensures these run in a consistent manner.

The alternative would be to explicitly stop an already running daemon at the beginning of the test (which we do in some other tests).

Right, plus an explicit wait for that daemon to discover nodes instantiated within tests.

mikaelarguedas commented 4 years ago

True, but the daemon doesn't really make things any better in this case i.e. where each test sets up a ROS graph of its own.

For sure. Could we run the no-daemon tests on CI using other RMWs to assess what flakiness this may introduce and tune the wait time accordingly ? (c.f. https://github.com/ros2/sros2/pull/168)

True, and I agree we can change this test to cover both cases.

The alternative would be to explicitly stop an already running daemon at the beginning of the test (which we do in some other tests).

Right, plus an explicit wait for that daemon to discover nodes instantiated within tests.

I think this is what we should do here, otherwise we remove test coverage for things we know broke the CLI and trading it for potential CI flakiness reduction.

hidmic commented 4 years ago

Fair enough. I'll mix and match #168 and this PR to test all RMW implementations, with and without a daemon.

hidmic commented 4 years ago

Alright, this has been superseded by #214. Let's continue the discussion there.

ros2 / sros2

Do not use ros2cli daemon in generate_policy tests. #212

Codecov Report