libcheck / check

A unit testing framework for C
GNU Lesser General Public License v2.1
1.07k stars 210 forks source link

Support for non-fatal assertions? #174

Closed ydahhrk closed 6 years ago

ydahhrk commented 6 years ago

Consider the following code:

START_TEST(foo)
{
    struct bar *barr;

    barr = create_bar();
    ck_assert(barr != NULL);

    do_something_with_bar(barr);

    ck_assert_int_eq(barr->a, 1);
    ck_assert_int_eq(barr->b, 2);
    ck_assert_int_eq(barr->c, 3);

    free(barr);
}
END_TEST

Because Check's asserts abort on failure (why?), if one of the int_eq asserts fail, this test will leak barr. Is this not the case?

Unless I'm missing something, working around this involves code that steadily grows the more asserts are involved:

START_TEST(foo)
{
    struct bar *barr;
    /* Annoying extra memory and variable clutter */
    bool success1, success2, success3;

    barr = create_bar();
    ck_assert(barr != NULL); /* This one is fine */

    do_something_with_bar(barr);

    /* Annoying separation between validations and asserts */
    success1 = barr->a == 1;
    success2 = barr->b == 2;
    success3 = barr->c == 3;

    free(barr);

    /* Annoying nonstandard error messages */
    /*
     * Annoying potential error message obsolescence
     * (eg. rename the `a` field using automatic refactor tools
     * -- they will inevitably forget to update this error msg.)
     */
    ck_assert_msg(success1, "barr->a != 1");
    ck_assert_msg(success2, "barr->b != 2");
    ck_assert_msg(success3, "barr->c != 3");
}
END_TEST

I can't think of a failure case where skipping the free would be a good idea, because barr is (self-evidently successfully) allocated in the heap (by contract), and there's no reason to abort early anyway, because the three int_eq asserts are independent of each other.

Could Check allow temporary non-fatal assertions? Which will mark the entire test as failed but will continue execution regardless. Something like

disable_suicide(); /* Following assertions will not kill the test
            on failure */
ck_assert_int_eq(barr->a, 1);
ck_assert_int_eq(barr->b, 2);
ck_assert_int_eq(barr->c, 3);
enable_suicide(); /* Following assertions will kill the test on
            failure. Test might already be failed
            regardless. */

And/or

tcase_set_fatal_aborts(case, false);
brarcher commented 6 years ago

Because Check's asserts abort on failure (why?)

Could Check allow temporary non-fatal assertions? Which will mark the entire test as failed but will continue execution regardless.

I've not really considered an assertion which 'latches' the bad state until the end of the test before. The concept of an 'assertion' that I have is "If the following condition fails, there is no point in continuing on as something is wrong and the state of the program may not longer be sane". In terms of test design, failing early makes it easier to determine where the failure started. If failures are allowed to accumulate (which assumes that one failure may ultimately lead to other related failures) the later failures are less relevant as they are just downstream consequences to the original failure.

if one of the int_eq asserts fail, this test will leak barr. Is this not the case?

Are you running the tests with fork() mode enabled? If fork() is enabled (which is the default on *nix systems) then each unit test is run within its own memory space. Yes, bar would be leaked, however there would be no long term consequence as the memory for bar would be reclaimed as the process would soon die.

If, however, you are running the unit tests without fork() enabled (the only option for Windows), then the leaked bar would remain until the unit testing program terminates. Those leaks would add up. One would hope (expect?) that for a successful test run that all tests would pass which would result in no leaks.

Could Check allow temporary non-fatal assertions?

Lets entertain this idea. Say there was an API such as you mentioned. If assertions were made non-fatal but multiple assertions did fail for a given unit test: 1) What would be reported for the result of the given unit test? Would it be marked as 'failed'? What error message would be reported? Would it be the first failure, the last failure, or a collection of all failures? 2) If a unit test had assertions temporarily disabled, a failure occurred, then the assertions were re-enabled, what would happen? Would the latched failure state be asserted at that time, immediately terminating the unit test?

As a final question, is it that would are trying to report more data on a failed unit test than simply a check on one piece of data? Is learning about a single failure not providing enough data to diagnose the issue in the code and you are looking for multiple data points from a given failure?

ydahhrk commented 6 years ago

As a final question, is it that would are trying to report more data on a failed unit test than simply a check on one piece of data? Is learning about a single failure not providing enough data to diagnose the issue in the code and you are looking for multiple data points from a given failure?

"If the following condition fails, there is no point in continuing on as something is wrong and the state of the program may not longer be sane". In terms of test design, failing early makes it easier to determine where the failure started. If failures are allowed to accumulate (which assumes that one failure may ultimately lead to other related failures) the later failures are less relevant as they are just downstream consequences to the original failure.

I think that you're assuming that everything you test is stateful.

Which I guess is fair, given that this language's undefined behavior leads to the entire program being meaningless. (Although unit tests will not necessarily catch undefined behavior, and even if they do, the error might be misleadingly observed long after it happened.)

But either way, for whatever reason the routines that have given me the most trouble in the past are the really low-level, performance-critical, deceptively complex and reliable-result kind of functions. The ones in which the failures are independent but follow patterns that are valuable to know while debugging. Take this very dumb example:

public int multiply(int a, int b);

BEGIN_TEST(test_multiply) {
    ck_assert_int_eq( 0, multiply(0,  0));    /* Success */
    ck_assert_int_eq( 0, multiply(0,  1));    /* Success */
    ck_assert_int_eq( 0, multiply(1,  0));    /* Success */
    ck_assert_int_eq( 1, multiply(1,  1));    /* Success */
    ck_assert_int_eq(-1, multiply(1, -1));    /* Error: Expected -1, got 1
    ck_assert_int_eq(-1, multiply(-1, 1));    /* Success */
    ck_assert_int_eq(1, multiply(-1, -1));    /* Error: Expected 1, got 98732498 */
    ck_assert_int_eq(-100, multiply(5, -20)); /* Success*/
    ck_assert_int_eq(25, multiply(-5, -5));   /* Error: Expected 25, got 434334 */
    ck_assert_int_eq(0, multiply(0, 4));      /* Success */
    ck_assert_int_eq(0, multiply(0, -4));     /* Error: Expected 0, got INT_MIN */
    ck_assert_int_eq(0, multiply(0, -6757));  /* Success */
    /* Etc. Assume many more asserts here */
}
END_TEST

-> A fatal run will stop on the first error and likely send the developer on a fool's errand in search for the missing negative operand during sign computation. -> A non-fatal run would inform the developer that the error has little to do with sign and only happens (roughly) when -6 < b =< -1. (And a has nothing to do with it.)

(I'm aware that this argument is not that strong because I can scatter all those asserts to different test functions, but I guess it's worth mentioning that doing this yields much more clutter --I have to name every single one of these functions and manually add them to the test case.)

Are you running the tests with fork() mode enabled? If fork() is enabled (which is the default on *nix systems) then each unit test is run within its own memory space. Yes, bar would be leaked, however there would be no long term consequence as the memory for bar would be reclaimed as the process would soon die.

Ok. This is relieving, but is this behavior specified by some standard?

Are you assuming that the OS will have fancy memory sanitizing features?

I see that POSIX, for example, ensures that file descriptors don't leak, but I don't see a similarly blunt statement about heap memory.

I get that unit tests are not deployed code, and that they are even less troublesome when no failures are detected, but... you're asking users of your framework to rely on undefined behavior (or a workaround I guess) and disregard their memory correctness OCD. No? Isn't this some degree of risky?

Would it be marked as 'failed'?

Of course.

What error message would be reported? Would it be the first failure, the last failure, or a collection of all failures?

A collection of all failures.

If a unit test had assertions temporarily disabled, a failure occurred, then the assertions were re-enabled, what would happen?

(Pedantic note: Assertions are not what should be disableable; fatal abortions are.)

The test would continue running until its block returns or a fatal assertion fails. The test would be considered failed regardless.


Another argument in favor of a non-fatal runs option is that non-fatal assertions are simply more powerful. It's easier to consciously create fatal assertions out of non-fatal assertions...

/* Simplest case */
if (ck_assert(a, b))
    return;

/*
 * Giving additional information to the developer that's bound
 * to be useful.
 */
int error;
error  = ck_assert(c, d);
error |= ck_assert(e, f);
error |= ck_assert(g, h);
if (error)
    return;

/* Alternatively: */
if (ck_assert(c, d) | ck_assert(e, f) | ck_assert(g, h))
    return;

...than create non-fatal assertions out of fatal ones:

bool success1 = a == b;
bool success2 = c == d;
bool success3 = e == f;

if (!success1)
    printf("error1: %d != %d", a, b);
if (!success2)
    printf("error2: %d != %d", c, d);
if (!success3)
    printf("error3: %d != %d", e, f);

ck_assert(success1);
ck_assert(success2);
ck_assert(success3);

One more thing to add:

I think that many consider JUnit 4 to be the quintessential reference framework for unit testing. Whether you agree or not, I would like to pose the question of whether Check inherited its design philosophy (perhaps unknowingly) without thinking of the target language.

JUnit is a framework written in Java. And its assertions are fatal; I won't deny that. But I think there's an important distinction here: JUnit implements fatal assertions through Exceptions; not aborts. They are rather different. Exception-based test suicide is clean design because the finally keyword exists:

InputStream file = new FileInputStream("potato.txt");
try {
    TestCase.assertEquals(1, doSomething(file));
} finally {
    /* Will be executed regardless of the assertion verdict. */
    file.close();
}

It's beautiful Java, but far from idiomatic C.

That's why I feel that a testing framework written in C should acknowledge the fact that it's sitting on a language where exceptions do not exist and all exits are still expected by good practice to be resource-sensible.

bombadil commented 6 years ago

If fork() is enabled (which is the default on *nix systems) then each unit test is run within its own memory space. Yes, bar would be leaked, however there would be no long term consequence as the memory for bar would be reclaimed as the process would soon die.

Ok. This is relieving, but is this behavior specified by some standard?

Are you assuming that the OS will have fancy memory sanitizing features?

Reclaiming all the memory of a terminated process is a not so fancy but pretty basic OS feature. I would consider it a bug in the OS if this is not done.

ydahhrk commented 6 years ago

Ok. I guess that's fair. And it looks like my other arguments didn't trigger much chemistry.

Thank you for your time anyway.

brarcher commented 6 years ago

I would like to pose the question of whether Check inherited its design philosophy (perhaps unknowingly) without thinking of the target language.

I would suspect that its design does inherit from JUnit, or at least its philosophy. Looking around at other alternatives I was unable to find a framework which did not fail on the first assertion. It appears that a similar design philosophy is used in them.

I will not disagree that having something like non-fatal checks would be valuable in some use cases. If adding such support to Check made sense, it would likely be the work of another contributor from the community and not from myself.

Thank you for your time anyway.

No worries. Thanks for the idea.

ydahhrk commented 6 years ago

I will not disagree that having something like non-fatal checks would be valuable in some use cases. If adding such support to Check made sense, it would likely be the work of another contributor from the community and not from myself.

Ok. I wouldn't mind working on a pull request, but it also seems that nobody other than me would use the feature as things stand.

If this ever gets to a point where the feature is worth the extra code, give me a call.