Testing: shell script per test

Lestropie commented 6 months ago

Committing what I've generated thus far to a draft PR; likely not able to finish off this week.

Closes #2789. Supersedes #2825.

While this took a decent amount of time to work though, I'm hoping that it will make the addition of new tests a bit more accessible: one can add header comments explaining what the test does & why, and can actually make use of newline characters.

Previous behaviour:

For every file (one per command):
- Generate initial "cleanup" test
- For each line in file:
  - Generate test string identifier based on directory (binaries / scripts / unit tests), command name (from name of file), line index
  - Create test based on shell invocation

Desired behaviour:

[ ] For every file (one per test):
- [ ] Copy test shell script into build directory
- [x] Generate test string identifier based on command subdirectory & test shell script name
- [x] Create test that invokes that shell script
  - Include trap to clean up any temporary files afterwards regardless of outcome Edit: Alternative solution for this purpose under development
- [x] Add labels to tests to indicate:
  - [x] Classification (binaries/ scripts / unit tests)
  - [x] Command being evaluated / unit test being performed
  - [x] Whether or not to include a particular script test in CI (#2825). Ideally have some string included in the comments at the head of the shell script file that can be parsed by cmake to discover that that test should have that label attributed to it.
[x] Revise CI to make use of test labels rather than regex.
Revise documentation / CI to make use of test labels rather than regex. Edit: Deferred to #2855.

github-actions[bot] commented 6 months ago

clang-tidy review says "All clean, LGTM! :+1:"

Lestropie commented 6 months ago

@daljit46: What are the chances of getting some help on the cmake side here? Suspect you'd get it done 100 times faster than I would.

daljit46 commented 6 months ago

👍 Happy to help. I'll try to have a go at this today.

daljit46 commented 6 months ago

Copy test shell script into build directory

Why do we need to copy the script in the build directory?

github-actions[bot] commented 6 months ago

clang-tidy review says "All clean, LGTM! :+1:"

daljit46 commented 6 months ago

Just made some minimal changes to current CMake scripts to run binary tests. I relied on globbing to parse the files inside each folder, but we probably best to not use it in the final implementation. For cleaning up, it's doing the same thing as the previous structure: run the clean up command before running a test. We could probably run it after the test has finished or failed, but that would be slightly more involved (still doable).

daljit46 commented 6 months ago

Ideally have some string included in the comments at the head of the shell script file that can be parsed by cmake to discover that that test should have that label attributed to it.

If we don't rely on globbing, we could do this in CMake directly. IMO, that would be cleaner.

github-actions[bot] commented 6 months ago

clang-tidy review says "All clean, LGTM! :+1:"

Lestropie commented 6 months ago

Why do we need to copy the script in the build directory?

Previously, the lines of a file were read, and each was used to form a single test invocation, where the contents of the test were just a part of the test string. Here, where an individual test can be a multi-line bash script, my thinking was that those files would therefore be copied into the build directory, and the test string would involve executing that shell script. Not saying this absolutely has to be the case, it's just what I had expected would be your recommendation.

For cleaning up, it's doing the same thing as the previous structure: run the clean up command before running a test

So my plan here is different. The set of tests for a given command are not guaranteed to be executed all together, or in any particular order. Each test needs to be possible to run in isolation, and should ideally clean up after itself. Therefore what I'd prefer in conjunction with these changes is that:

Those manually added "cleanup" jobs be removed entirely;
Each individual test use the -force option such that they can succeed even if there happens to be loose temporary data from some other test lying around;
Each test to be executed include use of trap such that any temporary files generated by that test are deleted regardless of test success or failure. (happy if there's an alternative solution here; I just want for there to be no *tmp* files lying around after running a test, rather than exclusively deleting them upon later execution of some other test)

Lestropie commented 5 months ago

Put some effort into this one as I wish to be able to make use of this change to verify and finalise changes elsewhere.

Wait to see how long the additional Python script CI checks take to execute; cull list if required. Edit: Deferring activation of Python CI tests until dev discrepancies in test results are addressed.
[x] @daljit46 Currently you've modified the cleanup such that rather than adding a cleanup "test" after the set of tests for one command, it now inserts one such cleanup "test" after every individual test. This makes the ctest terminal feedback messy as half of it is taken up by cleanup scripts. Is there no other way to manipulate ctest to clean up any intermediate files regardless of test script completion? If not, I think I'd rather manually insert a trap call at the head of every test shell script file.
File testing/README.md wasn't modified to reflect the cmake transition in #2689. @daljit46 is this something you can do? Edit: Deferred to #2855.

daljit46 commented 5 months ago

@daljit46 Currently you've modified the cleanup such that rather than adding a cleanup "test" after the set of tests for one command, it now inserts one such cleanup "test" after every individual test. This makes the ctest terminal feedback messy as half of it is taken up by cleanup scripts. Is there no other way to manipulate ctest to clean up any intermediate files regardless of test script completion? If not, I think I'd rather manually insert a trap call at the head of every test shell script file.

Yes, that's quite verbose indeed. I just pushed a change that works around this problem by adding another level of indirection. Inspired by this StackOverflow post, I've added a wrapper RunTests.cmake (not super happy with the name, so let me know if you have a better suggestion) script that calls a given bash test. This allows us to eliminate the cleanup test because we can execute the cleanup command in the CMake script instead.

File testing/README.md wasn't modified to reflect the cmake transition in https://github.com/MRtrix3/mrtrix3/pull/2689. @daljit46 is this something you can do?

Yes, that issue is tracked in #2855. I haven't completed that PR yet as we have a lot of ongoing changes to the testing logic, but I will do so when things get more stable.

Lestropie commented 5 months ago

The cleanup is for some reason not working. Have masked it with -force and explicit deletion of expected output directories on a per-test basis, but it's nevertheless not doing as intended.

daljit46 commented 5 months ago

Not exactly sure why mrconvert_coord_end is failing.

Lestropie commented 5 months ago

I'm leaving the labels in place for CI limited testing of Python scripts, but commenting out the relevant lines from the CI. Some of these require relaxation of tolerances and/or regeneration of reference data; some but not all due to the mraverageheader changes in #2702. Rather than do this regeneration just for a limited subset of the Python tests, I think it makes more sense that closer to tagging I'll do a git bisect on all of the failing tests and check there isn't anything untoward as part of #2811.

Lestropie commented 4 months ago

@daljit46 The NPY unit tests are failing on Windows here. I noticed that the relevant CI job was installing the msys version of Python rather than the UCRT version that's used for all other packages; I tried in 9d2315daf3162bfa1e7558c56a9bc6dcacb120cb switching that, and also installing the numpy dependency, but still no success. From the logs I think the relevant C++ tools are failing to execute; and on my own Windows system those tools seem to fail to execute. It might therefore be similar to #2887?

daljit46 commented 4 months ago

I think #2887 is unrelated as I can use mrview just fine on my MSYS2 installation. I will try to investigate further.

Lestropie commented 4 months ago

File testing_gen_npy issues a warning if it can't import numpy, and this doesn't appear in the CI logs.

What might be happening instead is that the corresponding binary, testing_npyread.exe in this case, isn't executing, because it's not finding the shared DLL. The two NPY tests are handled differently to the other unit tests because they're not just executing a single binary; they are instead bash scripts, each of which executes a Python testing tool and a C++ binary testing tool. I modelled the relevant CMakeLists.txt content off of the C++ binary tests. Maybe I've stuffed something up. But given "list(APPEND EXEC_DIRS ${PROJECT_BINARY_DIR}/testing/tools)" is there, I'm confused as to why it's not working. @daljit46 save me?

daljit46 commented 4 months ago

I've tried to look into this today, but my laptop just died in the middle of debugging, so I'll try later again. I don't think 59a8c00ddb79e70aaaef5f34103971bfa382c5ce is correct though, since on Windows executables look for DLLs firstly in their current location and then in any path in $PATH. Thus, we need ${PROJECT_BIN_DIR}/bin to be part of ${EXEC_DIRS} so that the test executables can find mrtrix's shared libs.

Lestropie commented 4 months ago

Ah yes you're right; thought I was being clever noting that that was a carry-over from duplicating the cmake content for C++ binary Bash tests, whereas that path wouldn't be necessary for executing unit tests since no such binaries will be executed. But the issue predates that commit.

Lestropie commented 4 months ago

I can at least confirm that outside of the ctest environment, if I manually add bin/ from my build directory into my PATH, then testing_npyread.exe works and succeeds, whereas if I don't do that then testing_npyread.exe fails silently, just as occurs in the ctest logs (both locally and in CI).

daljit46 commented 4 months ago

Ok, so I spent a lot of time figuring out why CTest is unable to execute npyread. Running ctest --verbose I can see that despite providing the exact same $PATH and arguments, running the exact same commands manually yields a different result. Either this is a bug in CTest itself or in my understanding of how execute_process is supposed to work. Either way, a clear lesson for me is that we should not use RunTest.cmake to execute the bash tests because it makes debugging things very difficult. Instead, following your idea of using trap, we can just create a run_test.sh that replaces RunTests.cmake, something like this works:

#!/usr/bin/bash

script_path=$1
working_dir=$2
additional_paths=$3

trap 'rm -rf tmp* *-tmp-*' EXIT

cd $working_dir

if [ ! -z "$additional_paths" ]; then
  export PATH=$additional_paths:$PATH
fi

bash $script_path

Then we can move cmake/BashTests.cmake to testing/cmake and call run_tests from there. I've tested this and it works just fine. I'm happy to implement this if you agree.

Additionally, I just remembered that in #2437 we decided to disable npy tests on Windows because installing numpy was tricky. Here, we will run into the same problems so I'm not even sure if we want to make the effort to make the tests work on MSYS2.

Lestropie commented 4 months ago

I was getting some rather puzzling results myself yesterday. Wasn't able to sufficiently wrap my head around exactly what was going on to be able to make modifications or even post anything of use here.

There's clearly something going very wrong with the paths, given after your changes the MSYS2 CI produced this: npyread: line 3: rm: command not found

On my own Windows system, with a clean build, I now have all tests failing...

Then we can move cmake/BashTests.cmake to testing/cmake and call run_tests from there. I've tested this and it works just fine. I'm happy to implement this if you agree.

If you think it would be a worthwhile change in its own right, I'd be happy for the change to be made. I think that for the sake of getting this PR merged to facilitate progressing with others I'll do something different, but there's clearly something more fundamental going wrong that would be preferable to resolve longer term.

Additionally, I just remembered that in https://github.com/MRtrix3/mrtrix3/pull/2437 we decided to disable npy tests on Windows because installing numpy was tricky. Here, we will run into the same problems so I'm not even sure if we want to make the effort to make the tests work on MSYS2.

I had recollection of implementing code in all relevant commands to support the prospect of numpy being absent. The read tests should create an empty directory and then read an empty directory, issuing warnings each time; the write test should have been able to write a bunch of data but then not be checked. I didn't recall doing this not just for the sake of not having tests fail on systems where numpy isn't installed, but actually for passing CI on MSYS2. These mechanisms are however not working because the C++ binaries outright refuse to execute due to not finding the MRtrix3 core DLL (despite it being in PATH).

What I'm hoping will work instead in 36f333bdbc4b033b6a2476e046a5e3680c6d069d is to not generate those tests in the first place if numpy is not available. Hopefully that can at least get CI clearing here.

Lestropie commented 4 months ago

36f333bdbc4b033b6a2476e046a5e3680c6d069d doesn't work:

NPY tests get excluded on all systems, not just MSYS2 as intended.
Just like on my local system, all C++ binary tests now fail on the MSYS2 CI.
Would be better to use find_package(Python3 REQUIRED COMPONENTS NumPy); if(${Python3_NumPy_FOUND}), present since cmake 3.14. But on my local Windows system that test passes, only for ctest to still fail.

Could do a sneaky sneaky and omit the NPY tests based on if(MINGW AND WIN32), but I'd rather not...

Lestropie commented 4 months ago

On 1f4b8bebbea3fe3d8d482433964d1014067c8030:

$ ctest -L unittest --rerun-failed --output-on-failure --verbose
500: Test command: C:\msys64\mingw64\bin\cmake.exe "-D" "BASH=/usr/bin/bash.exe" "-D" "FILE_PATH=npyread" "-D" "CLEANUP_CMD=rm -rf /home/rob/src/mrtrix3/testing/unit_tests/tmp* /home/rob/src/mrtrix3/testing/unit_tests/*-tmp-*" "-D" "WORKING_DIRECTORY=/home/rob/src/mrtrix3/testing/unit_tests" "-P" "C:/msys64/home/rob/src/mrtrix3/cmake/RunTest.cmake"
500: Working Directory: C:/msys64/home/rob/src/mrtrix3/build_release/testing/unit_tests
500: Environment variables:
500:  PATH=/home/rob/src/mrtrix3/build_release/testing/unit_tests:/home/rob/src/mrtrix3/build_release/bin:/home/rob/src/mrtrix3/build_release/testing/tools:C:\msys64\home\rob\src\dcm2niix\build\bin
500:  C:\msys64\mingw64\bin
500:  C:\msys64\usr\local\bin
500:  C:\msys64\usr\bin
500:  C:\msys64\usr\bin
500:  C:\Windows\System32
500:  C:\Windows
500:  C:\Windows\System32\Wbem
500:  C:\Windows\System32\WindowsPowerShell\v1.0;C:\msys64\usr\bin\site_perl
500:  C:\msys64\usr\bin\vendor_perl
500:  C:\msys64\usr\bin\core_perl
500: Test timeout computed to be: 10000000
500: CMake Error at C:/msys64/home/rob/src/mrtrix3/cmake/RunTest.cmake:17 (message):
500:   Test npyread failed!

MRtrix3 / mrtrix3

Testing: shell script per test #2865