Normalize use of the word "test" in the docs.

schwern commented 12 years ago

"Test" is an overloaded term which could mean many things.

In the Test::Builder2 docs it should refer to the test program. That is, the set of asserts, events and results which makes up a complete test, usually one .t file.

Things like ok and is are "asserts". They produce "results".

This is pretty big, so you're welcome to submit it a piece at a time.

tborisova commented 12 years ago

Hi, I want to work on this task for GCI. What exactly should I do?

Xiong commented 11 years ago

I have banned, in my own work, the use of the isolated word "test", due to excessive overloading. The word itself, by itself, means nothing specific any longer.

Here's my jargon, mostly following accepted practice:

A test check is any single reportable comparison, e.g., ok( $got eq $want ).
A test case represents some combination of target code to be executed, arguments to be passed in to target code, results got of that execution, and expectations want'ed of those results, together with control or status info. Each case is submitted to several checks.
A test script is a file, e.g., 123-my-unit.t. Each script contains several cases.
A test battery is some dir containing several scripts. E.g., t/unit or xt/err.
A test suite is all of the batteries written for a given target.

The above form a strictly nested hierarchy.

Code under test is the target: target unit, target module, target functional group, target distribution.
A test template can be filled in and expanded into a script.
A test framework consists of the modules that support the testing activity. This is typically My::Project::Test and, of course, Test::More and friends.
A test harness is, e.g., TAP::Harness, including $ prove.

Other:

Harness, framework, suite, batteries, scripts, cases, and checks may be run. Other verbs are acceptable but I really do try to steer clear of 'to test'.
Scripts, batteries, and suites are proven.
Cases are executed. Results actually obtained from an execution are got; those expected are want; and these results are checked.

Implementation note: I invariably compose all checks for a given case into a single call to subtest{}.
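To make the implementation note concrete, here is a minimal sketch of one case composed into a single subtest, using only Test::More. The target function `roar` and the check names are hypothetical stand-ins defined inline so the script runs on its own; they are not taken from any real module.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical target code, defined inline so the sketch is self-contained.
sub roar {
    my ($sound) = @_;
    return $sound eq 'bark' ? 'Woof!' : 'Grr!';
}

# One case: a single execution of the target, submitted to several checks.
subtest 'teddy_bark' => sub {
    my $want = 'Woof!';
    my $got  = roar('bark');                 # execute the case

    ok( defined $got, 'execute' );           # check 1
    is( $got, $want, 'should-return' );      # check 2
    unlike( $got, qr/Grr/, 'return-unlike' );# check 3
};

done_testing;
```

Each subtest reports as one line in the enclosing script, so the script's plan counts cases rather than individual checks.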

Apologies to the mother tongue for expedient abuse.

pdl commented 11 years ago

Having an agreed vocabulary that forms part of the docs is important and possibly necessary to achieving consistency. To Xiong's comment (which looks good to me), I'd ask what we call the executions of the test. I often find myself searching for verbal nouns, such as execution or run, to describe the thing that is happening when perl is actually diving into Test::More::ok(). I think it's ok for these to be implementation-specific because we'd be likely to use them to make the documents describing the backend more comprehensible.

schwern commented 11 years ago

Very thorough! Particularly laying out all the various things which are called a "test".

What do you think of these changes and clarifications?

check -> assert which appears to be what's used in the xUnit world, which is to say everybody who is not Perl.
Could you give an example of a case? Would that be like a block of related asserts and supporting code? Is a subtest an example of a case? How about this idiom...

note "stat/lstat with no file"; {
    my $file = "i/do/not/exist";
    ok exception { path( $file )->stat };
    ok exception { path( $file )->lstat };
}

battery is like an artillery battery. I get it, but I'm not sure the analogy will work for most. sub-suite?
framework has some specific implications vs just a library. xUnit folks refer to the test framework, for example jUnit, but in their world the test library includes the harness. Test::More doesn't meet what Wikipedia calls a framework.
What does it mean to prove a test suite?
Do we need a different verb for cases? Can't they be run, too?
I know we currently say got/expected, but I've been moving to have/want mostly because they're easier to line up. I'm ok with keeping got/expected until we actually change our output to have/want.
What does it mean for a result to be checked?
I'll add one more, asserts have results.

Ovid commented 11 years ago

From: Michael G. Schwern notifications@github.com

Very thorough! Particularly laying out all the various things which are called a "test".

What do you think of these changes and clarifications?

* check -> assert which appears to be what's used in the xUnit world, which is to say everybody who is not Perl.
* Could you give an example of a case? Would that be like a block of related asserts and supporting code? Is a subtest an example of a case? How about this idiom...

note "stat/lstat with no file"; {
    my $file = "i/do/not/exist";
    ok exception { path( $file )->stat };
    ok exception { path( $file )->lstat };
}

In my testing world, a subtest is indeed a case:

subtest 'new user without cookie' => sub { ... };
subtest 'new user with do not track header' => sub { ... };
subtest 'returning user with do not track header' => sub { ... };
subtest 'returning user whose language switched from fr to en' => sub { ... };

In each of those subtests, I have a series of 'tests' (assertions) whose names explain what should happen in that case. For a well-written subtest description, having the name of the subtest diag'ed (is that a word?) at the top of the subtest would make it even more helpful:

# new user without cookie
    1..3
    ok 1 - new users do not have profiles
    ok 2 - ... and thus capping rules do not apply
    ok 3 - ... and blah, blah, blah comes into effect
ok 1 - new user without cookie

If you have an extremely long subtest, having to read to the bottom of the subtest to understand the use case can be confusing at times.

Cheers,

Ovid


Xiong commented 11 years ago

What do you think of these changes and clarifications?

I think I'm flattered you took any notice of my thoughts. More? The proper place for such stuff is the chad bin.

assert

No. An assertion is beyond question. Perhaps a check that triggers a BAILOUT merits a stronger word than check.

case

As Ovid says; a group of attributes of a single execution of a target. Example:

$self->{case}{ teddy_bark       }   = {
    sort    => 3,
    sub     => sub {
        Acme::Teddy::roar();
    },
    args    => qw| bark |,
    want    => {
        return_is       => 'Woof!',
        quiet           => 1,
    },
};  ## case

This generates (unrelated example, sorry potato):

# ---- Error-Base-Cookbook: avoid-death-un-err
    ok 1 - execute
    ok 2 - should-return
    ok 3 - warning-like
    ok 4 - no-stdout
    1..4
ok 7 - Error-Base-Cookbook: avoid-death-un-err
# ---- Error-Base-Cookbook: avoid-death-tommy
    ok 1 - execute
    ok 2 - should-return
    ok 3 - return-like
    ok 4 - return-quietly
    1..4
ok 8 - Error-Base-Cookbook: avoid-death-tommy

battery

Yes. The act of testing is hostile; the intent is to defeat the target code. "Sheaf"? "Quiver"?

framework

Don't care what the xUnit folks say. Don't care about this jargon. Test::More and TAP::Harness are invariant. "Thingy"? Throne.

prove

The target's suite has been proven when prove runs all its shipping test scripts (t, not xt) and all PASS. A target cannot itself be proven to be correct.
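As a rough illustration of "proven" in this sense, the sketch below drives TAP::Harness (the module behind prove) from Perl and asks whether every script passed. The temporary directory and the one-line script `01-sketch.t` are assumptions made up for the example; a real suite would point the harness at its own t directory.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use TAP::Harness;

# Hypothetical stand-in for a shipping test script under t/.
my $dir    = tempdir( CLEANUP => 1 );
my $script = File::Spec->catfile( $dir, '01-sketch.t' );

open my $fh, '>', $script or die "Cannot write $script: $!";
print {$fh} <<'END_SCRIPT';
use Test::More tests => 1;
ok( 1, 'trivial check' );
END_SCRIPT
close $fh or die "Cannot close $script: $!";

# verbosity -3 keeps the harness silent; we only want the verdict.
my $harness    = TAP::Harness->new( { verbosity => -3 } );
my $aggregator = $harness->runtests($script);

# The suite counts as proven only if every script reported PASS.
my $proven = $aggregator->all_passed;
print $proven ? "suite proven\n" : "suite not proven\n";
```

Note that this says something about the suite only; it does not show the target itself to be correct.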

case

Surely a case can be run. A case executes or is executed (without distinction) and then is checked. I see no reason to conflate the phases.

have / want

Absolutely. I cannot thank you enough.

result checked

A result is not checked; a case is checked and, for each check, have is compared to want. In some checks the comparison is implicit:

want    => {
    quiet   => 1,
},

There is no have from this case's execution containing the value 1. This line demands that the quiet() checker determine nothing was emitted to STDOUT or STDERR.
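For illustration only, a checker of this kind might be sketched as below. The names `emitted` and `quiet_ok` are hypothetical and not part of Test::More or any other module mentioned here; the sketch captures only output printed through Perl's own STDOUT and STDERR handles, not output from child processes.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: run $code and return whatever it printed
# to STDOUT and STDERR via Perl-level filehandles.
sub emitted {
    my ($code) = @_;
    my ( $out, $err ) = ( '', '' );
    {
        local *STDOUT;
        local *STDERR;
        open STDOUT, '>', \$out or die "Cannot capture STDOUT: $!";
        open STDERR, '>', \$err or die "Cannot capture STDERR: $!";
        $code->();
    }
    return ( $out, $err );
}

# Hypothetical checker: pass only if nothing at all was emitted.
sub quiet_ok {
    my ( $code, $name ) = @_;
    my ( $out, $err ) = emitted($code);
    return ok( $out eq '' && $err eq '', $name );
}

# A case whose execution should be silent.
quiet_ok( sub { my $sum = 1 + 1; return $sum }, 'quiet' );

done_testing;
```

The comparison is implicit in the sense described above: the want is "nothing", and the checker derives the have by observing the handles rather than by inspecting a return value.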

The word "result" itself is as generic as "return value" and says nothing particular. You already have have, a perfectly good word for the concept.

assert results

Still no. You want to have your cake and give it to the monkey, too. If it is truly an assertion then it is asserted and beyond question; there are no results since no operation or comparison has taken place.

$i  = sqrt -1;      # I generally call this an assignment.

Best practice in production code is to write in such a way that fatal errors are thrown early and often:

print {$logfh} "I'm feeling itchy." 
    or die "Scratchy: Scratch disk full.";

given ($platonic) {
    when (/tetrahedron/)    { do_tet($_) }
    when (/cube/)           { do_box($_) }
    when (/octahedron/)     { do_oct($_) }
    when (/dodecahedron/)   { do_doc($_) }
    when (/icosahedron/)    { do_ico($_) }
    default                 { die "What do you think I am, a sphere?"}
};

... I call these "sanity checks". What programmers mean by "assertion" is similar; I dislike both the abuse of English and the code syntax employed:

$boy :eq 'Bob';
$boy eq 'Bob' or die "Impossible error! The boy is always Bob.";

If you golf off five characters to avoid writing a literate error message then you're writing code for show, not for production. This is like poetry for an audience of poets.

The logic of a test script should never be so complex that it requires sanity checks. The test module or thingy may indeed be quite complex and be stuffed full of sanity checks but these do not exist from the user's viewpoint.

If the user of the test thingy, who is the author of the test script, feels a certain check is so critical that proving cannot proceed in the event of failure, then he must call BAILOUT in the checker or assign the BAILOUT attribute to the check. The term of art attaching to this is... bailout. If you make the '_' optional, I will give you all the spare change in my Volvo's ashtray.
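For reference, a minimal sketch of a critical check in current Test::More, where the function is spelled BAIL_OUT. The choice of File::Spec as the module to load is arbitrary and serves only to keep the example runnable with core Perl.

```perl
use strict;
use warnings;
use Test::More;

# A check so critical that nothing later is worth running if it fails.
# File::Spec is an arbitrary core module chosen for the sketch.
require_ok('File::Spec')
    or BAIL_OUT('File::Spec did not load; no point in continuing');

# Reached only when the critical check passed.
ok( File::Spec->can('catfile'), 'later checks run as usual' );

done_testing;
```

On failure, BAIL_OUT emits a "Bail out!" line with the reason and exits at once, which tells the harness to stop running the remaining scripts as well.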

Xiong commented 11 years ago

I'm not strongly attached to battery as the term for a directory of test scripts; but I worry about choosing truly unusual words. The intent is to be clear about what one is doing, not to obscure it.

While I think it's perfectly rational to call a test script a script I'm having second thoughts about my assumption that a group of cases is invariably a script. Perhaps another grouping word is demanded. Ah, cluster?

Xiong commented 11 years ago

fixture

There: no conflict with any use of the word 'framework'; short; and parallel to a standard term in hardware testing.

Ovid commented 11 years ago

The act of testing is hostile; the intent is to defeat the target code.

I have to admit that I really, really like this description, and it helps to create an appropriate mindset.

Xiong commented 11 years ago

Thank you, Ovid. I've been thinking overtime on jargon issues, especially since being stimulated by you and Mike Schwern. Apologies if I've given any offense; I see no advantage to muddying what I write with endless "Don't you think?" clauses. Beware all my thinking along these lines is more or less tentative.

More thoughts:

report

Statement of correctness ( pass/fail, good/bad ) from any element of the framework.

diagnostic

Message intended to assist in finding causes of a fail or to indicate progress of a proof.

... Therefore "diagnostic report" is an excluded phrase.

results

As I wrote earlier, the word's area of effect has so broadened that it requires a modifier to regain specificity. I'm happy with it as a grammatical tool but not as a descriptor. There are several important classes of results.

have/want, outcome/demand

In code, Mike's employment of have/want is vastly superior to prior consensus and to my own previous habit. In documentation, I find it clumsy to write "We obtain a have", etc. So:

ok( $have eq $want );            # pass if outcome is equal to demand

pass/fail

Correct/incorrect outcome; TAP 'ok'/'not ok'.

good/bad

Correct/incorrect mechanism (function). Not bijunctive with pass/fail.

battery, cluster, sheaf

I refuse to get excited about grouping words -- Mike is probably right that "battery" will not always trigger the right chain of thought... but then, what would? Let us say that these generic grouping words ought not be used in isolation; the things grouped must be mentioned: Battery of scripts, cluster of cases, sheaf of checks. I am opposed firmly to "set".

framework/harness/fixture/suite/target

The target is the production code to be tested. The framework includes harness, fixture, and suite. The harness is invariant (for some value of "invariant"); all users employ the same harness. The suite is written to reflect a specific target (or if you adhere to TDD, the target is written to conform to the suite) and if every check in every case in every script in the suite reports pass then the suite and target are proven good. The fixture is semi-custom, neither fully general nor fully custom; it connects the suite to the harness.

The suite is immediately responsible for execution; the fixture is primarily responsible; the harness is ultimately responsible. In our context, I believe execution of the target must always be in the passive voice: the target has no initiative.

Xiong commented 11 years ago

I'm still unhappy with "cluster of cases". I say a {group} of cases is defined by its TAP plan, e.g., '1..7'. Also, some attributes may be defined over the entire group. For instance, all cases in the group may be declared "quiet", to produce neither STDOUT nor STDERR.

Of course if TAP::Harness is not the consumer of output then there may not be any plan printed. I'm uncomfortable with "plan of cases". An innocent bystander suggested "body of cases" and denied practicing law.

I was less interested in this particular entity before I realized that it is the correct location of the test counter, which is needed to output the TAP plan. It needs a name.

From TAP::Harness's viewpoint there can be only one such group per script. Script of cases ?
