freebsd / kyua

Port/Package build and test system
https://github.com/freebsd/kyua/wiki
BSD 3-Clause "New" or "Revised" License

Implement race-condition test cases #27

Open jmmv opened 10 years ago

jmmv commented 10 years ago

From jmmv@google.com on September 18, 2011 14:40:37

Test cases that attempt to reproduce race conditions will sometimes pass and sometimes fail. If the race condition is known to exist, we want this to behave as an "expected failure". Among others, pooka@ and jrouho@ from NetBSD have requested this feature.

We should have a way to represent such tests in a succinct way with support from ATF and Kyua. For example, we should be able to specify how many iterations of the test we want to run before considering it "pass" if the race does not trigger (and this should probably be a user-tunable setting). Similarly, the idiom to say "pass if all goes well, report as expected failure if it fails" should be easy to express -- and at the moment is not.
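To make the requested idiom concrete, here is a minimal sketch of what a test author has to hand-roll today with the existing atf-c(3) interfaces; try_to_trigger_race and the hard-coded iteration count are hypothetical stand-ins, not part of any proposed API:

    #include <atf-c.h>
    #include <stdbool.h>

    /* Placeholder: a real test would put its race-provoking code here
     * and return true if the race actually fired. */
    static bool
    try_to_trigger_race(void)
    {
        return false;
    }

    /* Hard-coded today; the request is to make this user-tunable. */
    #define RACE_ITERATIONS 1000

    ATF_TC_WITHOUT_HEAD(known_race);
    ATF_TC_BODY(known_race, tc)
    {
        for (int i = 0; i < RACE_ITERATIONS; i++) {
            if (try_to_trigger_race()) {
                /* The race is known to exist, so report the failure
                 * as expected rather than as a plain failure. */
                atf_tc_expect_fail("Known race condition");
                atf_tc_fail("Race triggered on iteration %d", i);
            }
        }
        /* The race did not trigger in any iteration: plain pass. */
    }

    ATF_TP_ADD_TCS(tp)
    {
        ATF_TP_ADD_TC(tp, known_race);
        return atf_no_error();
    }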

Original issue: http://code.google.com/p/kyua/issues/detail?id=27

jmmv commented 10 years ago

From jmmv@google.com on September 19, 2011 05:58:24

There are more notes about this here: http://mtn-host.prjek.net/viewmtn/atf/revision/file/7ae5fa10200ec66c26f08c27e4aeaf3facf3f031/TODO

jmmv commented 10 years ago

From julio@meroh.net on October 11, 2013 08:23:43

Copying the referenced discussion here.

gcooper:

I would think that stress/negative tests would be of more value than race condition tests (they're similar, but not exactly the same in my mind).

In particular,

  1. Feed through as much data as possible to determine where reporting breaks down.
  2. Feed through data quickly and terminate ASAP. The data should be captured. Terminate child applications with unexpected exit codes and signals (in particular SIGCHLD, SIGPIPE, exit codes that cause termination, etc.).
  3. Open up a file descriptor in the test application, don't close the file descriptor.
  4. fork(2) a process; don't wait(2) for the application to complete.

There are other scenarios that could be exercised, but these are the ones I could think of off the top of my head.
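As an illustration of scenarios 3 and 4, a minimal sketch in plain C (POSIX only, no ATF involved) whose body deliberately leaks the resources:

    #include <fcntl.h>
    #include <unistd.h>

    int
    main(void)
    {
        /* Scenario 3: open a descriptor and never close it. */
        int fd = open("/etc/passwd", O_RDONLY);
        (void)fd;

        /* Scenario 4: fork(2) a child and never wait(2) for it. */
        if (fork() == 0)
            _exit(0);  /* the child exits immediately */

        /* Exit without close() or wait(): the framework has to cope
         * with the leaked descriptor and the unreaped child. */
        return 0;
    }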

jmmv:

  1. The thing is: how do you express any of this in a portable/abstract interface? How do you express that a test case "receives data"? What does that exactly mean? I don't think the framework should care about this: each test should be free to decide where its data is and how to deal with it.
  2. Ditto.
  3. Not sure I understand your request, but testing for "unexpected exit codes" is already supported. See wiki:DesignXFail for the feature design details.
  4. What's the problem with this case? The test case exits right away after terminating the execution of its body; any open file descriptors, leaked memory, etc. die with it.
  5. Forking and not waiting for a subprocess is a problem that was already addressed.

I kinda have an idea of what Antti means by "race condition tests", but every time I have tried to describe my understanding I seem to be wrong. It would be nice to have a clear description of what this involves; in particular, what the expectations from the framework are and how the feature should be exposed.

As of now, what I understand by "race condition test" is: a test case that exercises a race condition. The test case may finish without triggering the race, in which case it just exits with a successful status. Otherwise, if the race triggers, the test case gets stuck and times out. The result should be reported as an "expected failure" distinct from a plain timeout.
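To make that definition concrete, here is a minimal self-contained example of such a test in plain C (a classic lock-order inversion; none of this is Kyua or ATF API). If the interleaving goes badly the process deadlocks and hits the framework's timeout; otherwise it exits successfully:

    #include <pthread.h>
    #include <stdio.h>

    /* Two locks taken in opposite order by two threads: if the
     * scheduler interleaves them the wrong way, the test deadlocks
     * (the race triggered); otherwise both threads finish. */
    static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

    static void *
    thread1(void *arg)
    {
        pthread_mutex_lock(&a);
        pthread_mutex_lock(&b);
        pthread_mutex_unlock(&b);
        pthread_mutex_unlock(&a);
        return arg;
    }

    static void *
    thread2(void *arg)
    {
        pthread_mutex_lock(&b);
        pthread_mutex_lock(&a);
        pthread_mutex_unlock(&a);
        pthread_mutex_unlock(&b);
        return arg;
    }

    int
    main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_create(&t2, NULL, thread2, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("race did not trigger\n");
        return 0;  /* under the proposal, reported as a plain pass */
    }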

pooka:

Yup. Plus some atf-wide mechanism for the operator to supply some kind of guideline for whether the test should try to trigger the race for a second or for an hour.

jmmv:

Alright. While mocking up some code for this, I think that your two requests are complementary.

On the one hand, when you are talking about a "race condition" test you really mean an "expected race condition" test. Correct? If so, we need to extend the xfail mechanism to add one more case, which is to report any failures as a race condition error and, if there is no failure, report the test as successful.

On the other hand, the atf-wide mechanism to control how long the test should run can be thought of as a "stress test" mechanism. I.e., run this test for X time / iterations and report its results regularly, without involving xfail at all.

So, with this in mind:

Stress test cases implement a single iteration of the test, and atf-run is in charge of running the test several times, stopping on the first failure.
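A rough sketch of that division of labor, with the runner-side loop written out in plain C; run_one_iteration and the iteration count are hypothetical, and nothing here reflects existing atf-run behavior:

    #include <stdbool.h>
    #include <stdio.h>

    /* Stub: stands in for executing the test body once, the way the
     * runner would invoke a single iteration of a stress test. */
    static bool
    run_one_iteration(void)
    {
        return true;
    }

    int
    main(void)
    {
        const unsigned iterations = 100;  /* the user-tunable knob */

        for (unsigned i = 1; i <= iterations; i++) {
            if (!run_one_iteration()) {
                fprintf(stderr, "failed on iteration %u\n", i);
                return 1;  /* stop on the first failure */
            }
        }
        puts("all iterations passed");
        return 0;
    }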

Does that make sense?


Implement ATF_REQUIRE_ERRNO

pooka:

Most of the lines in tests against system functionality are:

    if (syscall(args) == -1) atf_tc_fail_errno("flop");

Some shorthand would be helpful, like ATF_REQUIRE_ERRNO(syscall(args)). Also, a variant which allows arbitrary return value checks (e.g. "!= 0" or "< 124" or "!= size") would be nice.
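One way such a shorthand could look, sketched as plain C macros; these definitions are illustrative only and deliberately avoid the ATF_ prefix so they are not mistaken for real ATF API:

    #include <atf-c.h>
    #include <errno.h>
    #include <string.h>

    /* Common case: -1 signals failure and errno holds the detail. */
    #define REQUIRE_SYSCALL(call) \
        do { \
            if ((call) == -1) \
                atf_tc_fail("%s failed: %s", #call, strerror(errno)); \
        } while (0)

    /* Variant: the caller supplies an arbitrary return value check. */
    #define REQUIRE_SYSCALL_CHECK(call, check) \
        do { \
            if (!((call) check)) \
                atf_tc_fail("%s: check '%s' failed: %s", #call, \
                    #check, strerror(errno)); \
        } while (0)

    /* Usage inside a test body might look like:
     *   REQUIRE_SYSCALL(open("/etc/passwd", O_RDONLY));
     *   REQUIRE_SYSCALL_CHECK(write(fd, buf, len), == (ssize_t)len);
     */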

gcooper:

There's a problem with this request: not all functions fail in the same way. In particular, compare the pthread family of functions (which return the error number directly) vs. many native syscalls. Furthermore, compare some fcntl-like syscalls vs. other syscalls. One-size-fits-all solutions may not be a wise idea in this case, so I think the problem statement needs to be better defined, because the above request is too loose.

FWIW, there's also a TEST macro in LTP which tests for non-zero status and sets global variables for the errno and return code, respectively. It was a good idea, but it has been mostly abandoned because it's too difficult to define success and failure in a universal manner, so I think we need to be careful with what's implemented in ATF so as not to repeat the mistakes others have made.

jmmv:

I think you've got a good point.

This was mostly intended to simplify the handling of the stupid errno global variable. I think this is valuable to have, but maybe the macro/function name should be different because _ERRNO can be confusing. Probably something like an ATF_CHECK_LIBC / ATF_CHECK_PTHREAD approach would be more flexible and simpler.
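A hedged sketch of what that split might look like; neither macro exists in ATF, but the pair illustrates why a single wrapper cannot cover both error conventions:

    #include <atf-c.h>
    #include <errno.h>
    #include <string.h>

    /* Libc convention: -1 signals failure, errno holds the code. */
    #define CHECK_LIBC(call) \
        do { \
            if ((call) == -1) \
                atf_tc_fail("%s: %s", #call, strerror(errno)); \
        } while (0)

    /* Pthread convention: the error code is the return value itself
     * and errno is not involved. */
    #define CHECK_PTHREAD(call) \
        do { \
            const int ret = (call); \
            if (ret != 0) \
                atf_tc_fail("%s: %s", #call, strerror(ret)); \
        } while (0)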