Perl reporter does not handle failing `ok` correctly

hobovsky commented 7 months ago

In Perl tests, it seems that when an ok assertion fails, the reporter fails to emit both <FAILED::> and <COMPLETEDIN::>, what results in garbled up test output:

This issue makes it difficult to create tests which would test for truthy/falsy values.

See this kumite for example: https://www.codewars.com/kumite/654eb0963974cb5737be145c

Assertions like is, isnt seem to create correct reports. I did not check assertions other than is, isnt, and ok.

kazk commented 7 months ago

Perl uses https://github.com/codewars/raku-tap2codewars like Raku (Perl 6) to convert the TAP output to Codewars format.

The test is run with:

perl -I lib ./t/tests.t 2>&1 | tap2codewars

where the files are:

lib/solution.pm (file name based on package name)
lib/preloaded.pm (optional, file name based on package name)
t/tests.t

hobovsky commented 6 months ago

First of all, problem: I was not able to run the tap2codewars on my local installation of Raku.

I managed to run test suite in Perl, it generated TAP output for successful and for failed tests, so I compared these, looked at the code of tap2codewars, and I was trying to find places in code which would result in a buggy behavior considering differences in TAP of tests using is and ok. I think I found something, but it's not 100% confirmed because I was not able to run, fix, and test the tap2codewars :(

My test suite:

use List::Util qw(shuffle);
use warnings;
use strict;
use Test::Most;
use Kata;

subtest "Truthy with `is`" => sub {
    is(Kata::return_truthy(), 1, "return_truthy compared with `is`");
};

subtest "Falsy with `is`" => sub {
    is(Kata::return_falsy(), 0, "return_falsy compared with `is`");
};

subtest "Truthy with `ok`" => sub {
    ok(Kata::return_truthy(), "return_truthy compared with `ok`");
};

subtest "Falsy with `ok`" => sub {
    ok(not(Kata::return_falsy()), "return_falsy compared with `ok`");
};

done_testing();

My SUT (failing one):

package Kata;
use strict;
use warnings;

sub return_truthy {
    return 0;
}

sub return_falsy {
    return 1;
}

1;

Produced output:

# Subtest: Truthy with `is`
    not ok 1 - return_truthy compared with `is`
    #   Failed test 'return_truthy compared with `is`'
    #   at ./t/tests.t line 8.
    #          got: '0'
    #     expected: '1'
    1..1
    # Looks like you failed 1 test of 1.
not ok 1 - Truthy with `is`
#   Failed test 'Truthy with `is`'
#   at ./t/tests.t line 9.

... snip ...

# Subtest: Truthy with `ok`
    not ok 1 - return_truthy compared with `ok`
    #   Failed test 'return_truthy compared with `ok`'
    #   at ./t/tests.t line 16.
    1..1
    # Looks like you failed 1 test of 1.
not ok 3 - Truthy with `ok`
#   Failed test 'Truthy with `ok`'
#   at ./t/tests.t line 17.

... snip ...

1..4
# Looks like you failed 4 tests of 4.

In case of successful tests, there is no problem: successful tests are reported correctly, no matter if assertion is is, or ok. This part works well.

Failed tests are reported correctly when assertion is is, but they seem to be strangely truncated when the assertion is ok (see screenshots in the initial post). In produced TAP, these two are different in a way that a failing is emits additional lines with actual and expected, while the failing ok does not. Failing ok emits only test message and failing line, and nothing more.

Now, looking at the code of the reporter, there are two relevant functions:

        sub handle-comment(TAP::Comment $comment) {
            # Ignore diagnostics after subtest failure.
            # We already have the subtest name.
            return if $state == AfterSubTestFailure;

            # Not using `.comment` to avoid losing the leading spaces.
            my $content = $comment.raw.subst("# ");
            # Skip some redundant comments
            return if $state == Normal && $content ~~ /:s ^^Subtest: .+$$/;
            return if $content ~~ /:s ^^You failed \d+ tests? of \d+$$/;
            return if $content ~~ /:s ^^Looks like you failed \d+ tests? of \d+\.$$/;
            return if $content ~~ /:s ^^\s*Failed test at .+ line \d+\.?$$/;
            return if $content ~~ /:s ^^\s*Failed test \'.+$$/;
            return if $content ~~ /:s ^^\s*at .+ line \d+\.?$$/;

            @buffer.append: $content ~ "\n";
        }

        sub output-buffered() {
            if ?@buffer {
                if $state == AfterTestFailure {
                    # `indent(*)` outdents to the margin (removes common white space)
                    say "\n<FAILED::>Test Failed<:LF:>{@buffer.join.indent(*).trim-trailing.subst("\n", "<:LF:>", :g)}";
                    say "\n<COMPLETEDIN::>";
                } else {
                    say "\n{@buffer.join.trim-trailing.subst("\n", "<:LF:>", :g)}";
                }
                @buffer = ();
            }
            $state = Normal;
        }

If I read the code correctly, what happens is:

handle-comment does not find any comment (because there is no expected or actual lines, and all existing lines fall under the # Skip some redundant comments part), and leaves the @buffer empty, and as a consequence
output-buffer operates on an empty buffer, skips the if ?@buffer, and does not output <FAILED::> nor <COMPLETED::>, leaving the collapsible sections unclosed and resulting in what can be observed on CW (nested red groups with no messages).

I would work on a fix but it's difficult without possibility to get the tap2codewars running locally. What would have to be done is to convince the reporter to output <FAILED::> and <COMPLETEDIN::> even if the buffer is empty (what most probably means to remove the outer if @buffer check and compose the failure message depending on contents of the buffer).

But without possibility to run the tests I do not know what other pieces of code pass through this branch.

kazk commented 6 months ago

Sorry, I haven't had the time to look into this yet.

But without possibility to run the tests I do not know what other pieces of code pass through this branch.

Our Perl image actually uses our Raku image as base, but here's a concatenated Dockerfile you should be able to modify and experiment:

FROM alpine:3.12

RUN set -ex; \
    adduser -D codewarrior; \
    mkdir -p /workspace; \
    chown -R codewarrior:codewarrior /workspace;

RUN set -ex; \
    cd /tmp; \
    wget -q -O rakudo-pkg.apk https://github.com/nxadm/rakudo-pkg/releases/download/v2020.09/rakudo-pkg-Alpine3.12_2020.09-01_x86_64.apk; \
    apk add --no-cache --allow-untrusted rakudo-pkg.apk; \
    rm rakudo-pkg.apk;

ENV PATH=/opt/rakudo-pkg/bin:/opt/rakudo-pkg/share/perl6/site/bin:$PATH
RUN set -ex; \
    zef install https://github.com/codewars/raku-tap2codewars/archive/v0.0.6.tar.gz; \
    rm -rf /root/.raku /root/.zef;

WORKDIR /workspace

USER codewarrior
ENV USER=codewarrior \
    HOME=/home/codewarrior

# just checking if `tap2codewars` works
RUN rakudo -e 'use v6; use Test; is(1, 1); done-testing;' | tap2codewars

# Perl image starts here
# FROM qualified/raku:2020.09
# alpine:3.12 perl 5.30
USER root
RUN apk add --no-cache \
    perl \
    perl-test-most \
    perl-datetime

USER codewarrior
ENV USER=codewarrior \
    HOME=/home/codewarrior
WORKDIR /workspace

You should be able to experiment by replacing zef install https://github.com/codewars/raku-tap2codewars/archive/v0.0.6.tar.gz with zef install ./path (after copying the local tap2codewars project into the container image to path).

codewars / runner

Perl reporter does not handle failing `ok` correctly #271