BoostGSoC18 / tensor

Adding tensor support to boostorg/ublas

Multiple Executable Unit-Tests #9

Open bassoy opened 6 years ago

bassoy commented 6 years ago

Boost's unit-test framework allows us to define test suites and organize tests. Do we need to provide multiple executables for the different matrix and vector unit tests? We could select each test suite and each test conveniently using command-line arguments.

Note that, because of that, the matrix and vector headers are included and the template classes are instantiated!

bassoy commented 6 years ago

The tensor addition uses BOOST_TEST_DYN_LINK to link the test module dynamically, for instance.
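For illustration, a minimal sketch of such a dynamically linked test module with one test suite could look like this (the module, suite and case names below are made up for illustration, not taken from the repository):

```cpp
// Minimal Boost.Test module sketch (illustrative names only).
#define BOOST_TEST_DYN_LINK          // link against the shared Boost.Test library
#define BOOST_TEST_MODULE TestTensor // exactly one translation unit defines the module
#include <boost/test/unit_test.hpp>

BOOST_AUTO_TEST_SUITE(test_tensor_extents)

BOOST_AUTO_TEST_CASE(default_construction)
{
    // placeholder check; a real test would exercise the extents class
    BOOST_CHECK_EQUAL(1 + 1, 2);
}

BOOST_AUTO_TEST_SUITE_END()
```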

stefanseefeld commented 6 years ago

I'd prefer multiple test executables to isolate tests as much as possible. There are of course tradeoffs to be made: if most of the code is shared, we'd pay a performance penalty for compiling multiple test executables.

bassoy commented 6 years ago

Tests are already very much isolated through test suites. You can run each suite and each test in isolation. I am not sure what the benefits of multiple test executables would be.

stefanseefeld commented 6 years ago

A few reasons:

bassoy commented 6 years ago

compilation time

I thought just the contrary: with multiple test executables you include the matrix and vector headers multiple times; with one executable, only once.

compilation failures in a single test case will prevent all test cases from the same binary from running

No, it will continue with the remaining tests and report the failed and the succeeded ones.

stefanseefeld commented 6 years ago

Right, so let me elaborate: Depending on the code layout, the total compilation time may be reduced if we only compile a single executable. However, total compilation time isn't all that counts. It might be more useful to get feedback from individual test cases as early as possible, so interleaving compilation and running test cases, rather than compile-all, then run-all, may be preferable. If all the tests are to be performed through a single test executable, and the executable can't be built because code from one specific test case fails to compile, no test case will be able to run. Using multiple test executables solves that particular issue.

bassoy commented 6 years ago

Okay, but this means that we are giving up the full feature set of the Boost unit test framework because we fear that someone might push broken code to the dev branch.

Would pull requests with broken (untested) code be accepted for the dev branch? Is pushing to the development branch without unit testing allowed? Isn't exactly this issue the reason for having a development and a master branch (plus additional feature branches), each with their own unit tests and continuous integration?

stefanseefeld commented 6 years ago

I don't think the usefulness of tests should be measured exclusively in a context in which all tests pass. Quite the opposite ! :-)

Yes, ideally all tests always pass, in particular before a pull request is accepted and merged. But unit testing is an important tool to get there, not only once you have arrived. (And don't forget the scenario where initially all tests pass, then someone introduces a change that causes regressions. It's much harder to understand what change exactly causes the regression if all of a sudden the entire test suite fails, rather than just one.)

What features are we missing if we split test suites into multiple executables ?

bassoy commented 6 years ago

It's much harder to understand what change exactly causes the regression if all of a sudden the entire test suite fails, rather than just one.

Right, but if just one test fails, the other tests will still run. Finding the problem depends on how you design your tests within the test suites. Note that I still have multiple cpp files; one cpp file may declare one test suite, logically categorizing my unit tests. But only one executable. However, if my test suite in test_tensor_strides.cpp fails, test_tensor_extents.cpp still runs.
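As a sketch of that layout, a second translation unit could contribute its own suite to the same executable; only one file defines BOOST_TEST_MODULE, the others just add suites (the file name follows the ones mentioned above, the test body is made up):

```cpp
// test_tensor_strides.cpp -- linked into the same test executable;
// it does NOT define BOOST_TEST_MODULE (that is done in exactly one other file).
#define BOOST_TEST_DYN_LINK
#include <boost/test/unit_test.hpp>

BOOST_AUTO_TEST_SUITE(test_tensor_strides)

BOOST_AUTO_TEST_CASE(first_order_layout)
{
    // illustrative placeholder; a real test would check the stride computation
    BOOST_CHECK(true);
}

BOOST_AUTO_TEST_SUITE_END()
```

A single suite can then be selected at run time with the framework's command-line options, e.g. --run_test=test_tensor_strides, without touching the other suites.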

What features are we missing if we split test suites into multiple executables ?

stefanseefeld commented 6 years ago

On 2018-05-28 04:20 PM, Cem Bassoy wrote:

It's much harder to understand what change exactly causes the
regression if all of a sudden the entire test suite fails, rather
than just one.

Right, but if just one test fails, the other tests will still run. Finding the problem depends on how you design your tests within the test suites. Note that I still have multiple cpp files; one cpp file may declare one test suite, logically categorizing my unit tests. But only one executable. However, if my test suite in test_tensor_strides.cpp fails, test_tensor_extents.cpp still runs.

If the first fails to compile, the second can only run if it's part of a separate executable.

What features are we missing if we split test suites into multiple
executables ?
  • With Boost unit test you can define multiple test suites (BOOST_AUTO_TEST_SUITE), which simplifies the construction of the test tree.

Yeah, I admit, getting consistency is a bit tricky when multiple layers and frameworks are involved.

Ah, but there shouldn't be any dependencies among tests (just as the first sentence in the doc states) ! :-)

  • I can enable and disable test suites and/or tests easily using command-line options. This is so convenient when a test does not pass: you can easily disable all other tests using command-line options. Currently, we have to modify the Jamfile in order to disable the tests, which to me is an awful approach.

I agree, to some degree. It's awful because Boost.Build is cumbersome. It could be much simpler and more intuitive. There are, however, scenarios that require the involvement of the build system, for example when you want to express that a certain chunk of code is supposed to fail to compile, which is a situation not uncommon when testing C++ template-heavy libraries. And indeed, I expect we'll want such tests at some point for Boost.uBLAS.

  • I can control the logging output for the complete framework. Just imagine you would like to specify the verbosity of the output: you would need to do it for each of the separate unit-test executables.

Yeah.

bassoy commented 6 years ago

On Mon, May 28, 2018 at 10:35 PM, Stefan Seefeld notifications@github.com wrote:

On 2018-05-28 04:20 PM, Cem Bassoy wrote:

It's much harder to understand what change exactly causes the regression if all of a sudden the entire test suite fails, rather than just one.

Right, but if just one test fails, the other tests will still run. Finding the problem depends on how you design your tests within the test suites. Note that I still have multiple cpp files; one cpp file may declare one test suite, logically categorizing my unit tests. But only one executable. However, if my test suite in test_tensor_strides.cpp fails, test_tensor_extents.cpp still runs.

If the first fails to compile, the second can only run if it's part of a separate executable.

I think we are talking about two different things here. Are you concerned about the fact that someone might accidentally merge or push (non-unit-test) code (e.g. changes in matrix.hpp) that does not compile? In such a case multiple test executables only help to some degree. For instance, if test1 to test7 all include matrix.hpp, what have you gained if you are able to compile test_sparse_matrix but all other matrix tests fail to compile?

If we are talking about broken unit-test source files, well, then that does not necessarily affect the non-unit-test code. However, isn't this exactly why we have continuous integration and deployment, where we can choose the target architecture and get feedback after a few minutes on whether our pushed code compiles?

What features are we missing if we split test suites into multiple executables ?

  • With Boost unit test you can define multiple test suites (BOOST_AUTO_TEST_SUITE), which simplifies the construction of the test tree.

Yeah, I admit, getting consistency is a bit tricky when multiple layers and frameworks are involved.

Ah, but there shouldn't be any dependencies among tests (just as the first sentence in the doc states) ! :-)

In some cases it makes sense, and this is why Boost unit test allows you to set the dependencies explicitly. For instance, if the constructor of my auxiliary extents class does not work, I might want to skip testing the strides class, as it is based on the extents class. However, this is not the strongest argument here. :-)
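As an illustration of that feature, a hedged sketch using Boost.Test's depends_on decorator (available in newer Boost.Test versions; suite and case names are again made up):

```cpp
#define BOOST_TEST_DYN_LINK
#define BOOST_TEST_MODULE DependencyExample
#include <boost/test/unit_test.hpp>

namespace utf = boost::unit_test;

BOOST_AUTO_TEST_SUITE(test_tensor_extents)
BOOST_AUTO_TEST_CASE(ctor) { BOOST_CHECK(true); }
BOOST_AUTO_TEST_SUITE_END()

// The strides suite is skipped unless the extents constructor test passed.
BOOST_AUTO_TEST_SUITE(test_tensor_strides,
                      * utf::depends_on("test_tensor_extents/ctor"))
BOOST_AUTO_TEST_CASE(ctor) { BOOST_CHECK(true); }
BOOST_AUTO_TEST_SUITE_END()
```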

  • I can enable and disable test suites and/or tests easily using command-line options. This is so convenient when a test does not pass: you can easily disable all other tests using command-line options. Currently, we have to modify the Jamfile in order to disable the tests, which to me is an awful approach.

Especially when you develop test-driven: in that case you simply want to disable all other tests and focus on your code and your test unit, with high verbosity of the unit-test output.

I agree, to some degree. It's awful because Boost.Build is cumbersome. It could be much simpler and more intuitive. There are, however, scenarios that require the involvement of the build system, for example when you want to express that a certain chunk of code is supposed to fail to compile, which is a situation not uncommon when testing C++ template-heavy libraries. And indeed, I expect we'll want such tests at some point for Boost.uBLAS.

So the tensor tests are compiled into one executable. What about having three to four executables that use all the nice features of the Boost unit test framework? One for tensors (with proxies and all operations), one for matrices (with proxies and all operations), one for vectors (with proxies and all operations), and one for sparse matrices?

  • I can control the logging output for the complete framework. Just imagine you would like to specify the verbosity of the output: you would need to do it for each of the separate unit-test executables.

Yeah.
