Testplan system cleanup

Yarn Spinner has a number of different tests and testing systems. One of these I am calling the Testplan System. Testplan tests are essentially integration tests, and they form the bulk of the tests in the repo. The core idea of "make a yarn file, make a test plan file for it, and if they don't match the test fails" has worked quite well in my opinion. It is very easy to use and allows tests to be added rapidly, for the most part without writing any C# testing code. But the process grew organically, with features added or changed on the fly as needed to make things work. As a result, not only is the "best" way to use it unclear, any use beyond the basics is unclear. As we expand the scale and platforms of Yarn, I suggest that we clean up the test cases, make their assumptions clear, and document the test plan format so that the way we use test plans is consistent, easy to follow, and portable by others.

This involves needing to look at a few different things:

document the test plan format
output rules
compilation tests
runtime error tests
assert function and custom functions
hard to testplan functionality

Document Test Plan Format

The test plan files, .testplan, are text files that show the expected output of running a single yarn file and allow that output to be controlled, such as by selecting which option to choose. This system is entirely undocumented, which makes it hard to port tests and hard for new users to write tests for any features they create. This is a first attempt to document it by explaining my understanding of the format. At its core, each line of a test plan starts with a keyword that determines what action is expected from the Yarn system. The remainder of the line is interpreted according to that keyword.

If the test plan and the results sent over by the Yarn system disagree, that mismatch is considered an error and the test fails at that point.

Keywords

Each keyword is all lowercase and ends with a : symbol to delineate the rest of the line. The following are the keywords:

line
command
option
select
stop

line

line indicates that the next element the Yarn system sends over is to be a line. The remainder of the line in the test plan is an exact match of what the line should be, after any value interpolation or formatting.

command

command indicates that the next element the Yarn system sends over is to be a command. The remainder of the line in the test plan is an exact match of the command after value interpolation. The command is to be compared as a string and not as a series of individual components.

option

option indicates that the next element the Yarn system sends over is to be an option. The remainder of the line in the test plan is an exact match of the line component of the option, after any value interpolation or formatting. Because options are generally bundled together, it is common to see multiple options one after another in a test plan file. Options that end with [disabled] are options with a conditional statement that evaluated to false.

select

select is an action: it lets the testing system tell the Yarn system which option to select. The remainder of the line is the index of the option to be selected. The index is one-indexed, not zero-indexed, so the first option in the list is at index 1.

stop

stop is an action that says nothing more is to occur in the test plan. It is optional, and reaching the end of the file works the same way. Unlike the other keywords, stop has no other component.

Comments and Empty lines

If a line begins with a # symbol, the entire line is a comment and is to be ignored. If a line is empty, the entire line is to be ignored.
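The rules described so far can be sketched as a small parser. This is only an illustration of my reading of the format, written in Python rather than the C# the real tests use; the names Step and parse_testplan are hypothetical. It assumes that whitespace around the remainder of a line is trimmed, that a bare stop without the : is accepted, and that nothing after stop is read, all of which are among the open questions.

```python
from dataclasses import dataclass

# Keywords as described above; "stop" carries no payload.
KEYWORDS = {"line", "command", "option", "select", "stop"}


@dataclass
class Step:
    keyword: str
    payload: str


def parse_testplan(text: str) -> list[Step]:
    """Parse test plan text into a list of steps."""
    steps = []
    for number, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        # Empty lines and lines beginning with '#' are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        # Split on the first ':' only, so payloads may contain colons.
        keyword, sep, rest = stripped.partition(":")
        # Assumption: a bare "stop" without the colon is accepted too.
        if keyword not in KEYWORDS or (not sep and keyword != "stop"):
            raise ValueError(f"line {number}: unrecognised entry {raw!r}")
        payload = rest.strip()
        if keyword == "select":
            # Validate only; the index stays one-based as written in the plan.
            if not payload.isdecimal() or int(payload) < 1:
                raise ValueError(f"line {number}: select needs an index of 1 or more")
        steps.append(Step(keyword, payload))
        if keyword == "stop":
            break  # Assumption: nothing after stop is read.
    return steps
```

Parsing a plan of one line, two options, a select, and a stop would yield five steps, with comments and blank lines skipped along the way.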

Unclear elements

The following are elements of the test plan file and its rules that are unclear or ambiguous to me:

Are comments allowed anywhere in the test plan or just at the start of the line?
Is there a must-be-enabled equivalent to the [disabled] flag on options?
Does the select index map to the options as sent over or does it match to an option id?
What purpose does stop play?
Does stop need to be written as stop: or does it not require the :?
What happens if more elements are received after a stop or the end of the test plan?

Output rules

For the most part, Yarn Spinner's assumption that line providers will take over any value formatting for interpolation is the right one, but when building a testing system these assumptions need to be made explicit. How should values be formatted when interpolated into a line, option, or command?

In particular, what rounding and precision should numbers use, and are integers to be treated as integers or converted into decimal numbers, or vice versa? What spelling and capitalisation should the values true and false have?

Take the following example:

<<set $a = 1>>
<<set $b = 1.0>>
<<set $c = true>>

{$a}, {$b}, {$c}

Which of the following lines in the test plan is correct, if any?

line: 1, 1.0, true
line: 1.0, 1.0, true
line: 1, 1.0, True
line: 1.0, 1.0, True
line: 1, 1, true
line: 1, 1, True
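To make the ambiguity concrete, here is a rough Python sketch of how two independent formatting choices generate four of the candidates above. The function names and the two flags are hypothetical, and the sketch assumes every number is stored as a single floating-point type. The remaining two candidates (1, 1.0, ...) would only be possible if the system remembered how each number was originally written.

```python
def format_value(value, whole_as_integer: bool, capitalise_bools: bool) -> str:
    """Render a value as text under one hypothetical formatting policy."""
    # bool must be checked first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        text = "true" if value else "false"
        return text.capitalize() if capitalise_bools else text
    if isinstance(value, (int, float)):
        # Assumption: all numbers are stored as one floating-point type.
        number = float(value)
        if whole_as_integer and number.is_integer():
            return str(int(number))
        return repr(number)
    return str(value)


def candidate_lines(values):
    """List the test plan line each combination of policies would expect."""
    lines = []
    for whole_as_integer in (True, False):
        for capitalise_bools in (False, True):
            rendered = ", ".join(
                format_value(v, whole_as_integer, capitalise_bools) for v in values
            )
            lines.append(f"line: {rendered}")
    return lines


for line in candidate_lines([1, 1.0, True]):
    print(line)
```

Whichever combination is chosen, the point is that it has to be written down once so that every implementation of the test runner formats values the same way.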

Compilation tests

There are a few tests that I am calling compilation tests: tests which have an empty testplan file associated with them. Either the testing is done deeper inside C#, or they are just testing that the yarn file compiles at all, often to check syntax rules. Likewise we have the opposite: yarn files without a testplan, which means they are not meant to compile, and if they do compile this is an error.

Personally I find both of these approaches weird and confusing. Checking whether compilation does or doesn't work is not something that testplans are a good fit for. Some of these may tie into the hard to testplan functionality section below, but if a yarn file goes into the testplan system it should have a testplan file with it, and that testplan file should never be empty. If something doesn't compile, that should be an immediate failure of the test. Compilation tests are not something that I feel the testplan system can do a good job of; it just does not have the granularity required.

Runtime error tests

Somewhat related to the above, there is no way to test for runtime errors. As testplans test essentially the entirety of the system, it makes sense to be able to say "there will be an error now"; otherwise error handling cannot be tested this way. Is this desirable? It would also give us some extra levels, meaning we could somewhat support compilation tests on larger projects if we had something like error: compile and error: runtime, or something to that effect.
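As a rough illustration of how an error entry could slot into the existing matching rule, here is a Python sketch that compares expected steps with received events and reports the first mismatch. The function check_plan and the tuple representation are hypothetical, and error: compile and error: runtime are only the suggested names from above, not existing keywords. It also assumes that events arriving after the end of the plan count as a failure, which is itself an open question.

```python
def check_plan(expected, actual):
    """Compare expected steps with actual events.

    Both arguments are lists of (kind, text) tuples. An error is represented
    as ("error", "compile") or ("error", "runtime"), mirroring the proposed
    "error: compile" and "error: runtime" entries.
    Returns None on success, or a message describing the first mismatch.
    """
    for index, want in enumerate(expected):
        if index >= len(actual):
            return f"step {index + 1}: expected {want}, but nothing more was received"
        got = actual[index]
        if got != want:
            return f"step {index + 1}: expected {want}, got {got}"
    # Assumption: anything received beyond the end of the plan is a failure.
    if len(actual) > len(expected):
        return f"step {len(expected) + 1}: unexpected extra {actual[len(expected)]}"
    return None


plan = [("line", "Before the error"), ("error", "runtime")]

# The error arrives where the plan says it should, so the test passes.
print(check_plan(plan, [("line", "Before the error"), ("error", "runtime")]))

# The script carried on instead of failing, so the test reports a mismatch.
print(check_plan(plan, [("line", "Before the error"), ("line", "After")]))
```

Treating an error as just another kind of expected element keeps the "mismatch means failure" rule unchanged.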

Assert and other custom functions

Some yarn files and their associated test plans use and rely on custom functions. This is going to be unavoidable, but these should be kept to a minimum. I also feel they should be mentioned in the comments at the top of the testplan and yarn file so that they can be reproduced if needed.

A great many of the tests, however, use a custom function called assert, which takes in an expression and checks that the expression evaluates to true. I dislike this. It grew out of the test plan system over time and no longer serves a purpose in my mind. It adds an additional layer to test something that the testplans themselves can already test, forcing two testing systems into the one test. For the most part these can be straight up replaced with a line: true in the testplan and {expression to be evaluated} in the yarn.

Hard to testplan

Some things are just plain hard to test using the test plan. In some cases this is fine, because not everything needs to be run through the same testing system. But some things are difficult due to the limits of the testing system itself; in particular, there are two limitations that restrict it from being used to automate testing of larger projects.

All testplans are associated with a single yarn file, and all start at the Start node. It should be easy to allow a testplan to work across multiple yarn files and to say which node to launch. This could be done by adding a run: nodename action to the testplan that launches that node, so that testplans could be used to test entire projects at once.
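One way a runner might consume such an action is to split the plan into segments, one per launched node. The Python sketch below is only an illustration: run: is the proposed keyword, not an existing one, and the function name split_by_run is hypothetical. It assumes that entries appearing before any run: keep today's behaviour of starting at the Start node.

```python
def split_by_run(plan_text, default_node="Start"):
    """Group test plan entries by the node a hypothetical "run:" entry launches.

    Returns a list of (node_name, entries) pairs in the order they appear.
    """
    segments = []
    current = None
    for raw in plan_text.splitlines():
        line = raw.strip()
        # Comments and empty lines are ignored, as elsewhere in the format.
        if not line or line.startswith("#"):
            continue
        if line.startswith("run:"):
            # A run: entry opens a new segment for the named node.
            current = (line[len("run:"):].strip(), [])
            segments.append(current)
            continue
        if current is None:
            # Assumption: entries before any run: start at the default node.
            current = (default_node, [])
            segments.append(current)
        current[1].append(line)
    return segments


plan = """
line: Hi
run: Shop
line: Welcome
select: 1
run: Ending
line: Bye
"""
for node, entries in split_by_run(plan):
    print(node, entries)
```

Each segment could then be run and checked like a single-node testplan is today, which would keep existing testplans valid without changes.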
