LLNL / lbann

Livermore Big Artificial Neural Network Toolkit
http://software.llnl.gov/lbann/

Make Catch2-based testing part of developer builds #1214

Open benson31 opened 5 years ago

benson31 commented 5 years ago

Since developers are supposed to use spack and Catch2 has a spack package, we should update our recipe/environment/whatever to include the Catch2 tests. This applies to hydrogen, too, though I haven't added many tests there.

davidHysom commented 5 years ago

What is Catch2? (Yes, I know I can google it. But I'm urging you (and others) to include in this and similar issues bullet points such as:

benson31 commented 5 years ago

Catch2 is the testing framework we use for a small portion of our unit tests (mostly utilities that @ndryden and I have added and a bunch of the transforms work that @ndryden did). Building with it is completely optional and will always remain optional. This change will just add it to the developer builds. (This support has been part of LBANN for several months now.)

LBANN desperately needs better unit tests. The Bamboo tests are good for integration testing, but they are painful to run and take several minutes even in the best case. Even the "unit tests" in Bamboo are pushing the boundary of integration testing: anything that uses real (i.e., non-mocked) MPI cannot be a unit test because it can fail for non-software reasons (e.g., someone unplugs some ethernet cables). Catch2 is a better choice for TDD; it integrates with CTest, and Bamboo will generate nice, formatted output for your tests.

As I've mentioned before, Catch2 is already configured in LBANN and Hydrogen for sequential testing (no MPI). It can be used for MPI-based testing, but there are two concerns. First, it needs a separate main() function to do the proper initialization and cleanup. Second, we don't integrate with a mocking framework for MPI, so the probability of false positives is higher (ethernet cables are unplugged, the internet goes away, you don't have an allocation, etc.). GPUs should be fine, but again, without a mocked framework, they can give false positives for non-software reasons (e.g., a CUDA build on a system without an NVIDIA GPU; yes, this is a thing).
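To make the mocking concern concrete, here is a minimal sketch of what testing against a mocked communicator could look like. Everything in it is hypothetical and for illustration only: `Communicator`, `MockCommunicator`, and `global_mean` are not LBANN, Hydrogen, or MPI names, and the sketch uses only the standard library so it stands on its own. The idea is that the code under test depends on a small interface, and the test supplies canned values instead of talking to real MPI.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical interface: the only communication the code under test needs.
// A production implementation would wrap real MPI calls; none is shown here.
struct Communicator {
  virtual ~Communicator() = default;
  virtual double allreduce_sum(double local_value) const = 0;
  virtual int size() const = 0;
};

// Hypothetical mock: pretends to be one rank among several. The values
// "contributed" by the other ranks are fixed up front, so no network,
// allocation, or MPI launcher is involved.
class MockCommunicator final : public Communicator {
public:
  explicit MockCommunicator(std::vector<double> other_ranks)
    : other_ranks_(std::move(other_ranks)) {}

  // Sum of the local value and the canned values from the other ranks.
  double allreduce_sum(double local_value) const override {
    return std::accumulate(other_ranks_.begin(), other_ranks_.end(),
                           local_value);
  }

  // Number of simulated ranks, including this one.
  int size() const override {
    return static_cast<int>(other_ranks_.size()) + 1;
  }

private:
  std::vector<double> other_ranks_;
};

// Hypothetical code under test: mean of one value per rank.
double global_mean(const Communicator& comm, double local_value) {
  assert(comm.size() > 0);
  return comm.allreduce_sum(local_value) / comm.size();
}
```

In a Catch2 test, a check such as "the mean of 1, 2, 3, 4 across four simulated ranks is 2.5" would sit inside a `TEST_CASE` and use `REQUIRE`. Because the mock is deterministic, a failure would point at the software rather than at the machine or the network.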

As for learning, there are examples in src/utils/unit_test/ and src/transforms/unit_test/. There is also a tutorial on the Catch2 GitHub site and a detailed reference section. It would be great if more developers adopted this framework, since it's very flexible and easy to use. There are also lots of features that I personally want to explore but haven't had the opportunity to yet.