Create automated test suites

Bananattack commented 5 years ago

As more development work happens on Wiz, it will eventually become useful to have an automated test suite that can ensure that no regressions have been introduced, and to ensure new features meet their requirements and work as documented. Also, as bugs are reported, it would be possible to add new tests to cover any problems that get fixed.

As the only core developer, I find that writing test harnesses takes a lot of time, which is why it hasn't happened yet. However, I see the value in automated unit tests of various systems in the compiler, and in functional tests that ensure the compiler produces the correct outputs for given inputs.

Being able to test that the compiler works as intended as it is developed will save time spotting regressions and debugging problems, and makes validation of internal commits and outside contributions easier.

Ideally it'd be cool to have a folder of test programs that could assert an exact output binary gets generated as a result.

It would also help to have a way to validate type-system and other intermediate details within the compiler.

undisbeliever commented 5 years ago

If you don't mind, I could probably write a simple test suite for you. If you're happy with this design, please let me know and I'll get to work on it sometime during the weekend.

We need to have three types of tests: error tests, block tests and assert tests.

This is an error test I use for the untech-engine memory-management macros. It consists of a // ERROR tag that tells a Python script where the compiler should fail. The test passes if the compiler does not return EXIT_SUCCESS and correctly displays the failing line number.

It would not be very difficult to modify the script to work with wiz.

// Try and create two Data blocks that overlap each other

define MEMORY_MAP = HIROM
define ROM_SIZE = 1

include "../../../src/common/memory.inc"

createDataBlock(rom0,  0xc00000, 0xc000ff)
createDataBlock(rom1,  0xc00100, 0xc001ff)
createDataBlock(rom2,  0xc00100, 0xc0017f)  // ERROR
createDataBlock(rom3,  0xc10000, 0xc1ffff)
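
The error-test convention described above could be checked with a script along these lines. This is a minimal sketch, not the actual untech-engine script: every function name here is hypothetical, the compiler command is supplied by the caller, and it assumes diagnostics are printed as `file:line: error: ...` (the format of the wiz error output quoted later in this thread).

```python
import re
import subprocess
from typing import List

# A line is expected to fail if it carries a `// ERROR` tag.
ERROR_TAG = re.compile(r"//\s*ERROR\b")


def expected_error_lines(source: str) -> List[int]:
    """Return the 1-based numbers of the lines tagged with `// ERROR`."""
    return [n for n, line in enumerate(source.splitlines(), start=1)
            if ERROR_TAG.search(line)]


def error_test_passes(filename: str, source: str,
                      returncode: int, output: str) -> bool:
    """Pass only if the compiler failed AND reported every tagged line."""
    if returncode == 0:
        return False  # the compiler returned EXIT_SUCCESS
    expected = expected_error_lines(source)
    if not expected:
        return False  # an error test without a tag is malformed
    # Assumed diagnostic format: "<file>:<line>: error: <message>"
    return all(f"{filename}:{n}: error:" in output for n in expected)


def run_error_test(compiler: List[str], path: str) -> bool:
    """Run `compiler` (a caller-supplied argv prefix) on one test file."""
    with open(path, encoding="utf-8") as f:
        source = f.read()
    proc = subprocess.run(compiler + [path], capture_output=True, text=True)
    return error_test_passes(path, source, proc.returncode,
                             proc.stdout + proc.stderr)
```

Keeping the pass/fail decision in `error_test_passes` separate from the subprocess call means the checking logic can itself be tested without a compiler installed.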

The block tests could follow a similar design: a hexdump snippet of the expected output could be added to the top of the test file. The script would invoke wiz with the correct platform (or platforms, if the code could work on multiple systems) and test whether a subset of the output matches.

Only a subset of the output would be tested. There is no need to test the ROM header as it could change between wiz releases.

For example:

// PLATFORM sfc
// OUTPUT 000000  c2 20 1a 3a e2 20 1a 3a 80 fe

bank code     @ 0x808000 : [constdata;  0x8000];

in code {
const @ 0x80FFFC = reset; // emu reset

#[fallthrough]
func reset() {
    mem16();
    #[mem16] {
        aa++;
        aa--;
    }

    mem8();
    #[mem8] {
        a++;
        a--;
    }

    while true { };
}
}
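
The header of a block test like the one above could be parsed and compared against the compiled binary roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the offset after `OUTPUT` is read as hexadecimal, and offsets are taken to be relative to the start of the output file. None of that is settled in the proposal.

```python
import re
from typing import Dict, List, Tuple

PLATFORM_TAG = re.compile(r"^//\s*PLATFORM\s+(.+)$")
# "// OUTPUT <offset>  <hex byte> <hex byte> ..."
OUTPUT_TAG = re.compile(
    r"^//\s*OUTPUT\s+([0-9a-fA-F]+)\s+((?:[0-9a-fA-F]{2}\s*)+)$")


def parse_block_test(source: str) -> Tuple[List[str], Dict[int, bytes]]:
    """Collect the platforms and the expected bytes keyed by offset."""
    platforms: List[str] = []
    expected: Dict[int, bytes] = {}
    for raw in source.splitlines():
        line = raw.strip()
        m = PLATFORM_TAG.match(line)
        if m:
            platforms.extend(m.group(1).split())
            continue
        m = OUTPUT_TAG.match(line)
        if m:
            # Assumption: the offset column is hexadecimal.
            expected[int(m.group(1), 16)] = bytes.fromhex(m.group(2))
    return platforms, expected


def mismatches(binary: bytes, expected: Dict[int, bytes]) -> List[int]:
    """Return the offsets whose expected bytes differ from the binary."""
    return [off for off, data in sorted(expected.items())
            if binary[off:off + len(data)] != data]
```

Only the listed byte ranges are compared, so regions such as the ROM header can change between releases without breaking the test.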

The third type of test is an assert test and would be used to verify memory addressing and expressions are valid.

These tests would require an assert statement to be added to wiz.

Bananattack commented 5 years ago

I like this plan! Definitely like the idea of having these different kinds of automated tests, and I appreciate the help. The first two sound like they could be done completely black-box/"functional testing" style with external scripts.

As for the asserts, I guess that depends what kind of things it would need to be able to do.

If it's a runtime assert, that would be a bit tricky to accomplish, and might be better done as a library function per platform.

If you mean an assert that happens during compilation, then it's a bit easier to do. I would probably avoid naming it assert directly and try to name it static assert or something like that (just so a runtime assert function could still be defined if the user writes one). Otherwise I am 100% on-board with this existing, since it's useful for verifying different stuff.

Depending on when the asserts get checked, they could be used to verify link-time stuff like label positions, but also compile-time checks that happen in earlier passes.

If we need access to compiler internals in the static assertions, that might be hard... there is an outstanding task to refactor the compiler code so it's split into more modular parts and easier to navigate. Wiz's compiler.cpp kinda blew up in size as it was being worked on, because everything was focused on just getting something working first.

But if we were to make incremental changes to the compiler code in favor of better test coverage, it might be possible to get there. Some internal tests, like instruction selection and expression folding and stuff would be great if they didn't have to make a round trip through the entire compiler, but could be done as smaller unit tests.

Anyways this got pretty rambly and that's a sign I should probably go to bed, hahah. Either way, even just the functional tests would be a great start, and we could make steps toward the others.

undisbeliever commented 5 years ago

Yeah, I was talking about static (compile-time) asserts. I don't think we need to expose wiz internals to test wiz.

Adding static assert to wiz would allow me to write tests that ensure:

wiz handles expressions correctly (e.g., static_assert 5 + 3 * 10 == 35)
memory addresses of fields are correct (e.g., static_assert &array_of_structs[4].field == &array_of_structs + (4 * 10 + 2))

These kinds of tests could be written using the block test format, but using static_assert would be neater and easier to understand.
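
To spell out the arithmetic those two assertions rely on, here is a small worked example. It is illustrative only: the helper name and the base address are hypothetical, and the 10-byte struct size and byte offset 2 are simply read off the expression `4 * 10 + 2` above.

```python
def field_address(base: int, index: int,
                  struct_size: int, field_offset: int) -> int:
    """Address of `array[index].field` in an array of fixed-size structs."""
    return base + index * struct_size + field_offset


# Expression folding: multiplication binds tighter than addition.
assert 5 + 3 * 10 == 35
assert (5 + 3) * 10 == 80  # what a precedence bug would produce instead

# Assumed layout: 10-byte structs with `field` at byte offset 2.
base = 0x0200  # arbitrary example address
assert field_address(base, 4, 10, 2) == base + (4 * 10 + 2)
assert field_address(base, 4, 10, 2) - base == 42
```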

Bananattack commented 5 years ago

Oh okay! That makes sense. This whole idea sounds pretty good, go for it!

It'll be really great to have this. Let me know if you need help with anything.

undisbeliever commented 5 years ago

Just to let you know, I am still working on this, it's just slow and demotivating. Currently I'm doing a bit at a time every day or so.

I have just finished writing the block test for the 6502 core, occasionally checking that the right instructions are emitted using a hex editor. I still need to run the output through a disassembler, verify the output, and then write the block-test Python script.

So there is progress, just not as much as I hoped.

Bananattack commented 5 years ago

Totally understandable, this kind of stuff is a fairly large undertaking! Still really appreciate the help. (As much as I like working on improving these tools, it's way more enjoyable putting together new game ideas and demos heheh.)

undisbeliever commented 5 years ago

I've committed the 6502 block test to my wiz fork, please let me know what you think. Please don't merge it in yet, I want to finish the 65c02 tests and write some failure tests before that happens.

The tests were written by compiling wiz code, disassembling it with radare2, confirming it's valid and then placing the disassembly in the source code.

The tests have caught a few bugs, I'm writing up the bug reports now.

There are two features that can make testing wiz easier.

1/ The ability to set a variable depending on the target system, so I can reuse the 6502 tests to test the huc6280

#if system == huc6280
  ZERO_PAGE_BANK = 0x2000
#else
  ZERO_PAGE_BANK = 0x0000
#endif

2/ The ability to reuse 6502 wiz code when the target is the wdc65816.

Currently we get the following error:

> ../bin/wiz --system wdc65816 -o obj/6502_alu.65816.bin block/6502_alu.wiz
* wiz: version 0.1.2 (alpha)
>> Parsing...
>> Compiling...
block/6502_alu.wiz:25: error: could not generate code for assignment `=`
block/6502_alu.wiz:25: note: got: `a = 12`
block/6502_alu.wiz:25: note: possible options:
  `a = {0..255}`
  `a = *({0..255} as *u8)`
  `a = *(({0..255} + x) as *u8)`
  ...
block/6502_alu.wiz:25: note: assignment must be rewritten some other way

I'm thinking we could allow mem8 and idx8 attributes in the 6502 targets, and the dev could mark wiz code as 6502 and 65816 compatible by adding

#[mem8, idx8]
in prg {
  <code>
}

to the top of the file.
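
Once one source file compiles for several targets, a test runner could build one command line per target. The sketch below reuses only the `--system` and `-o` flags visible in the invocation quoted above. The function name, the output-file naming scheme, and the idea of looping over a list of systems are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def wiz_command(wiz: str, system: str, source: str,
                out_dir: str = "obj") -> List[str]:
    """Build the argv for compiling `source` for one target system."""
    stem = PurePosixPath(source).stem
    # Hypothetical naming scheme: <out_dir>/<test name>.<system>.bin
    output = f"{out_dir}/{stem}.{system}.bin"
    return [wiz, "--system", system, "-o", output, source]


def commands_for(wiz: str, systems: List[str], source: str) -> List[List[str]]:
    """One command per target, so the same test can be reused across CPUs."""
    return [wiz_command(wiz, system, source) for system in systems]
```

For example, `wiz_command("../bin/wiz", "wdc65816", "block/6502_alu.wiz")` yields an argv that could be passed to `subprocess.run`.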

Bananattack commented 5 years ago

Wow, this is off to a great start!

Thanks for all of these reports so far, I will try to dig into this more soon. I just got a working desktop at home again the other day, so I'm hopeful to get more development time soon.

1/ There is an experimental way to put attributes in front of statements via #[compile_if(condition)] stmt; but the interface for doing this is really not great after all... I originally wanted to do something kind of like Rust cfg attributes, but the syntax is overly cumbersome, and Rust-style attributes cannot have an else clause. I think it'd make more sense to add C-style pre-processor syntax, but I might restrict it somewhat like C#, so that any defines must be strictly true/false expressions, with no macros or weird token-pasting stuff, rather than the full mess that is the #define expansion rules in C.

If you want to use what's there in the meantime: inside of a compile_if, any compile-time bool expression is possible. Normally constants cannot be optionally defined, so there was a special facility for checking special 'define' values, which are global values passed to the program. The __has_def("name") or __get_def("name", fallback) expressions let you access these defines. I think there might be a command-line feature for passing user-specific defines too, but I didn't look at that.

The syntax is like:

#[compile_if(__has_def("__cpu_6502"))] {
   ...
}

#[compile_if(__has_def("__cpu_6502"))] statement;

There's CPU defines for stuff like this:

__cpu_6502
__cpu_65c02
__cpu_rockwell65c02
__cpu_wdc65c02
__cpu_huc6280
__cpu_wdc65816
__cpu_z80
__cpu_gb
__cpu_spc700

It looks like there's "family" defines too, but I spot typos in these so I don't recommend these.

Anyway, blech. I want to rip that out and start over. If these are useful for tests in the meantime, feel free to use this feature though.

The other issue with attributes is that this is only available at the statement level, so if we want conditionally compiled sub-expressions or conditionally-added attributes, or other things, we can't do that. This might be fine, but C-style #if and #ifdef definitely have an advantage in being able to filter out lines anywhere in the middle of parsing, not just excluding statements during the semantic pass over the AST.

2/ Yeah, that could be possible as a quick method of supporting the same code on 6502 and 65816: the 6502 could have only the mem8/idx8 attributes for compatibility. The 6502 doesn't need to do anything with those flags; it just needs them to exist.

Also, if a preprocessor existed, this kind of thing would be easier.

Bananattack commented 5 years ago

Sorry, I still haven't got a chance to sit down with this! I'd like to take a look but I need some time cleared off to focus on this. The past couple weeks, it has been hard to do much once I got home from work, and time on my weekends has been fairly divided. I am hoping I get back my energy again soon. I am going on a small vacation in about a week, so maybe I will have a chance then to review this. I still really appreciate the effort you've put towards writing tests, and I would like to integrate your work as soon as I have the chance to properly review things.

undisbeliever commented 5 years ago

That's OK, I haven't done much work on the tests either.

undisbeliever commented 5 years ago

I've added some failure tests to my wiz fork, again please let me know what you think.

I've also caught a few more bugs and filed the appropriate issue reports.

Bananattack commented 5 years ago

Gradually reviewing these. Good catches!! I merged your tests upstream, and I can try and address the issues you've reported and make those go green! Slowly crawling back into doing dev stuff now that I'm on vacation.

wiz-lang / wiz

Create automated test suites #50