Further test improvements

Fiwo735 commented 5 months ago

Improved Python testing, making it the default and recommended testing option.
Removed duplicated test case (same functionality as _example)
Updated AST diagram to the new version

Fiwo735 commented 5 months ago

@Jpnock please let me know if you have any further suggestions regarding testing. I'd also appreciate if you took a look into the Github actions as I've updated that to use the new scripts/test.py.

Jpnock commented 5 months ago

Many thanks for updating the diagram : )

I've summarised the differences between the Python and bash script below which should probably be addressed. As for the GitHub Actions issue, this is likely due to there being no terminal TTY available so these code paths need disabling in CI.

Python test script differences

[x] If make fails, the script continues to execute (there's no exit code check under both the normal and coverage paths)
[x] There's no exit code check on make clean
[x] The make clean stage doesn't use the "-C", PROJECT_LOCATION argument like all of the other make commands
[x] ASAN_OPTIONS=exitcode=0 environment variable needs to be set when executing the student's compiler binary to prevent ASAN from forcing the test to fail if there's a memory leak etc.
[x] While --silent is a nice addition, there's no output from make until it completes, so it looks like the script is stuck for about 10 seconds when you run it (I suggest a one line print to say that we're building the compiler)
[ ] The verbose output should be updated to match the output of the bash testing script to improve readability (see test_sh.log vs test_py_verbose.log)
[x] There's no timeout on the C90 compilation stage to prevent one bad test hanging the rest (previously 15s I believe)
[x] Subject to opinion but I think the verbose output should probably be the default, with a --short option to just show the progress bar (since the full output introduces students to where the appropriate log/assembly files are for each test without them having to add a flag to figure this out)

GitHub Actions issue

I assume a TTY doesn't exist in the actions runtime, so the following line likely returns a non zero exit code.

Regardless, the progress bar code should be disabled in GitHub Actions otherwise you get output that looks like test_py.log. I think we'd also want the full verbose output in the actions logs too.

    _, max_line_length = os.popen("stty size", "r").read().split()
ValueError: not enough values to unpack (expected 2, got 0)

Other notes

Have we got a plan for how we're going to update the marking code to fit in with this python script? My concern is that the marking script currently extends the existing test.sh bash script -- so if we start running this script in CI, we now have two scripts with potentially differing results for the number of seen tests passing.

At 400 LOC the Python script is much more daunting from the students' perspective to read and understand, so I suggest that we at least move the ProgressBar class to a different file.

test_sh.log

scripts/test.sh > test_sh.log

compiler_tests/_example/example.c
    > Pass

compiler_tests/array/declare_global.c
    > Failed to compile testcase: 
     ./bin/output/array/declare_global/declare_global.compiler.stderr.log 
     ./bin/output/array/declare_global/declare_global.compiler.stdout.log 
     ./bin/output/array/declare_global/declare_global.s 
     ./bin/output/array/declare_global/declare_global.s.printed

compiler_tests/array/declare_local.c
    > Failed to compile testcase: 
     ./bin/output/array/declare_local/declare_local.compiler.stderr.log 
     ./bin/output/array/declare_local/declare_local.compiler.stdout.log 
     ./bin/output/array/declare_local/declare_local.s 
     ./bin/output/array/declare_local/declare_local.s.printed

test_py.log

scripts/test.py > test_py.log

Running Tests [                                                                ]
Pass: 0 | Fail: 0 | Remaining: 87
See logs for more details (use -v for verbose output).
[3A
Running Tests [                                                                ]
Pass:  0 | Fail:  0 | Remaining: 87 
See logs for more details (use -v for verbose output).
[3A
Running Tests [[92m#[0m                                                               ]
Pass:  1 | Fail:  0 | Remaining: 86 
See logs for more details (use -v for verbose output).
[3A
Running Tests [[92m#[0m[91m#[0m                                                              ]
Pass:  1 | Fail:  1 | Remaining: 85 
See logs for more details (use -v for verbose output).
[3A
Running Tests [[92m#[0m[91m#[0m                                                              ]
Pass:  1 | Fail:  2 | Remaining: 84 
See logs for more details (use -v for verbose output).
[3A
Running Tests [[92m#[0m[91m#[0m[91m#[0m                                                             ]
Pass:  1 | Fail:  3 | Remaining: 83 
See logs for more details (use -v for verbose output).
[3A
Running Tests [[92m#[0m[91m#[0m[91m#[0m[91m#[0m                                                            ]
Pass:  1 | Fail:  4 | Remaining: 82 
See logs for more details (use -v for verbose output).
[3A
Running Tests [[92m#[0m[91m#[0m[91m#[0m[91m#[0m[91m#[0m                                                           ]
Pass:  1 | Fail:  5 | Remaining: 81 
See logs for more details (use -v for verbose output).
[3A

test_py_verbose.log

scripts/test.py -v > test_py_verbose.log`

Running Tests [                                                                ]
Pass: 0 | Fail: 0 | Remaining: 86
See logs for more details (use -v for verbose output).
[3A
Running Tests [                                                                ]
[92mPass:  0[0m | [91mFail:  0[0m | Remaining: 86 
See logs for more details (use -v for verbose output).
/workspaces/langproc-2022-cw/compiler_tests/_example/example.c
    > Pass
/workspaces/langproc-2022-cw/compiler_tests/array/declare_global.c
    > Fail: see /workspaces/langproc-2022-cw/bin/output/array/declare_global/declare_global.compiler.stderr.log and /workspaces/langproc-2022-cw/bin/output/array/declare_global/declare_global.compiler.stdout.log
/workspaces/langproc-2022-cw/compiler_tests/array/declare_local.c
    > Fail: see /workspaces/langproc-2022-cw/bin/output/array/declare_local/declare_local.compiler.stderr.log and /workspaces/langproc-2022-cw/bin/output/array/declare_local/declare_local.compiler.stdout.log
/workspaces/langproc-2022-cw/compiler_tests/array/index_constant.c
    > Fail: see /workspaces/langproc-2022-cw/bin/output/array/index_constant/index_constant.compiler.stderr.log and /workspaces/langproc-2022-cw/bin/output/array/index_constant/index_constant.compiler.stdout.log
/workspaces/langproc-2022-cw/compiler_tests/array/index_expression.c
    > Fail: see /workspaces/langproc-2022-cw/bin/output/array/index_expression/index_expression.compiler.stderr.log and /workspaces/langproc-2022-cw/bin/output/array/index_expression/index_expression.compiler.stdout.log
/workspaces/langproc-2022-cw/compiler_tests/array/index_variable.c

Fiwo735 commented 5 months ago

Is there a point of doing a return code check on make clean given it's only using rm -f which cannot fail?

Jpnock commented 4 months ago

Is there a point of doing a return code check on make clean given it's only using rm -f which cannot fail?

rm -f can fail in a few cases, notably permissions errors (e.g. if someone accidentally runs the script as sudo or root and then later runs as non root)

root@host:~# cd /tmp
root@host:/tmp# mkdir test
root@host:/tmp# cd test/
root@host:/tmp/test# touch abc

normaluser@host:/tmp$ rm -rf test/
rm: cannot remove 'test/abc': Permission denied
normaluser@host:/tmp$ echo "$?"
1

Fiwo735 commented 4 months ago

@Jpnock I believe all your suggestions have been addressed through the recent commits. Some remarks:

I have addressed, but haven't tested ASAN_OPTIONS and TTY ValueError
Splitting test.py - in my opinion, it should stay as one file as it is less intimidating that way and the program flow is not too difficult to trace for students who wish to dive deeper into Python (but noone realistically has to ever open that file for the coursework)

Jpnock commented 4 months ago

@Jpnock I believe all your suggestions have been addressed through the recent commits. Some remarks:

Thanks for the changes :) I've replied with some follow up comments which are hopefully useful

Also, just a note on your previous point: but noone realistically has to ever open that file for the coursework. Last year I was asked by quite a lot of people to walk them through the testing script, so even if nobody has to open the file, lots of people definitely did try to understand it -- so we should bear this in mind. Hiding parts away like the progress bar and junit CI test file creation in separate files seems like a good idea since you don't need to understand those at all to understand how the main part of the script works.

Fiwo735 commented 4 months ago

Thanks for the suggestions! I've left 3 comments unresolved - please have a look at them.

Re: splitting test.py - I'm still slightly more on the opinion that 1 file is less scary than 2 (or preferably even more as just throwing everything non-crucial into a helper file is not the best practice) and makes it look like there is not that much Python involved in this coursework = less intimidating. But that's not a very strong opinion, so I'm gonna check what John thinks about that.

LangProc / langproc-cw