Closed madmike200590 closed 1 year ago
Patch coverage: 71.81
% and project coverage change: -0.18
:warning:
Comparison is base (
b16b9d6
) 79.04% compared to head (5dd8f80
) 78.86%.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
@AntoniusW this is now - finally - ready to be reviewed.
As this has been open for two months now, and all changes to the initial design discussed on #243 have been realized, I am now merging this as-is. @AntoniusW if you'd like any further changes/improvements to this feature, let me know and I'll do a follow-up PR.
Add support for unit tests to Alpha's input language
This PR adds unit tests as a feature to the ASP language interpreted by Alpha.
Unit test syntax
Test cases are defined using the directive
#test
. Every test case needs a name and some condition on the number of expected answer sets for the tested code. Furthermore, a unit test may specify some input data (i.e. facts) as "setup code". The actual verification happens in assertion blocks. The following grammar snippet formally defines the syntax of unit tests:The following example program shows a very simple test case using one universal assertion:
Evaluation of Unit Tests in Alpha
Tests are evaluated by Alpha according to the pseudocode algorithm below:
Program components, intermediate results, and functions:
U
: The tested unit, i.e. the "regular" ASP code Alpha is called withTC
: One Unit-Test, consisting of Test-InputTI
and assertionsA_1
throughA_n
TP
: The program resulting from adding test input facts (TI
) to program under testU
TA
: The answer sets ofTP
size(ASS)
: returns the number of answer sets in the setASS
CS
: The condition on the number of answer sets ofTP
(unsat
or comparison operator over psoitive integer)not_satisfies(size, expected)
: true iff the integersize
does not conform to the given satisfiability condition 'CS'P(AS)
: The program resulting from transforming the atoms in an answer set to a program consisting of facts onlyTR
: The answer set of the program resulting from solvingP(AS) union verifier(A)
, i.e. the verifier of assertion A with added factsP(AS)
assertionError(AS)
: checks if an answer setAS
contains assertion errors, i.e.assertionError(AS)
is true iffAS
contains atoms of any predicate with nameassertion_error
of any arityfail(T)
lets the testT
failpass(T)
lets the testT
passIntuitively, the test input
TI
of a testTC
is added to the program under testU
and solved. If the number of answer sets does not conform to the expectation of the test, the test fails. Assertions are evaluated by transforming every answer set of the program into facts and combine it with the verification program associated with the assertion. The assertion succeeds if the resulting program is satisfiable. This needs to hold for all answer sets in case of a universal assertion (assertForAll
) or at least one answer set for an existential assertion (assertForSome
).Evaluation Example
In order to illustrate how unit tests are evaluated, consider the following typical encoding of the graph 3-coloring problem:
Assume we want to verify that our "directed edge conversion rule"
edge(Y, X) :- edge(X, Y).
works as expected, i.e. for any edge from vertexa
tob
, we also get the reverse edge fromb
toa
. This can be accomplished using following test case:To understand how Alpha evaluates this test, let's first look at the
given
block: We specify a simple triangular graph with verticesa
,b
andc
and directed edges(a, b)
,(b, c)
and(c, a)
. The facts from this block are combined with the program itself, i.e. the actual test program with respect to which the test case is evaluated is:Solving this program in Alpha yields 6 answer sets:
The satisfiability condition of test case
no_asymmetric_edge
isexpect: >0
- since 6 is greater than zero, the condition holds.Next, the universal asssertion
assertForAll { :- edge(A, B), not edge(B, A).}
is verified. It uses one constraint to express that in no answer set there must be an edge without a reverse edge. Assertions are verified by converting each answer set to facts, combining it with the assertion code, and solving the resulting program. For example, in case of answer set 1 from above, this would yield:Solving this program yields one answer set:
It's easy to see that similarly, the resulting programs for all six answer sets will be satisfiable, thereby satisfying the universal assertion that in no answer set there must be an edge without a reverse edge.
Existential assertions work similarly, but instead of requiring the assertion to hold in all answer sets, it just needs to hold in at least one answer set in order for the assertion to be satisfied.
Notes on design and intended use
Assertion Encoding
Since the verifier associated with an assertion can be any kind of ASP program, one could put choice rules, directives, or virtually anything else there. However, the intended (and reasonable) usage is to stick to one or a few constraints, potentially with a few simple helper rules
Scope of a test
Implementation-wise the program under test is all the input Alpha is called with (potentially multiple files along with command-line-supplied strings), however, I propose the following guidelines for "good style when writing unit tests":
Commandline-interface for unit tests
By default, a commandline call like
java -jar Alpha.jar -i someFile.asp
will not run any unit tests contained insomeFile.asp
. The idea is that in most real-world uses there would be more than one input file which would make it necessary to disable tests using some flag most of the time. To run unit tests onsomeFile.asp
, the command isjava -jar Alpha.jar --run-tests -i someFile.asp
.Full test suite example
Building on the graph-coloring example from above, the following version of the graph coloring program is enriched with a comprehensive suite of unit tests covering most of the implementation using both universal and existential assertions: