sofa-framework / sofa

Real-time multi-physics simulation with an emphasis on medical simulation.
https://www.sofa-framework.org
GNU Lesser General Public License v2.1
871 stars 297 forks source link

[all] Add a WarningAndErrorAsTestFailure in most of the tests. #213

Closed damienmarchal closed 7 years ago

damienmarchal commented 7 years ago

Problem statement: Historically a lot of tests were implemented with the assumption that warning or error messages were tests failure. This assumption was problematic as it favor the testing of code paths that does not generate warning or errors instead of testing all the possible paths. To favor both valid and invalid paths this implicit behavior was remove and and made explicit using the ExpectMessage and MessageAsTestFailure raii's. In the process the old behavior of generating failure on errors was lost.

The PR mix the two and:

Example:

class MyTest : public Sofa_test<>

     /// This test will fail if a warning/error message is send because the test inherit from sofa_test
    void defaultTestBehavior()
    {
        ....
    }

    /// This test should generate failure if NO warning and NO Error message is emitted
    void catchingTestBehavior()
    {
        EXPECT_MSG_EMIT(Warning) ;
        EXPECT_MSG_EMIT(Error) ;

        mycomponent.load("atotallybrokenfile.txt")
    }

    /// By default test failure are catched from Sofa_test. This means that the information returned
    /// in the test failure are the location of the failure in Sofa_test.cpp:lineno
    /// This is why it is always better to explicitely specify the test behavior (possible the same as 
    /// the default one as it will report correct testname/line number information.  
    void noEmitTestBehavior()
    {
        EXPECT_MSG_NOEMIT(Warning) ;
        EXPECT_MSG_NOEMIT(Error) ;
    }

    /// Finally it is possible to narrow where the messages are expected using blocks...
    void complexTestBehavior()
    {
        /// Here is the default mode. 
        ....
        {
            /// We will report a failure is there no warning nor error emitted before the end of this block.  
            EXPECT_MSG_EMIT(Warning) ;
            EXPECT_MSG_EMIT(Error) ;
            ...
            { 
                /// But this sub-part does not count... and it shouldn't emit a message. 
                EXPECT_MSG_NOEMIT(Error) ;
                .... 
           }
        }
    }

This PR will generate a lot of new test failure... that are either real failure that deserve investigation or tests that was previously simply previously ignoring error & warning messages;

CHANGELOG for @hugtalbot & @guparan :

Reviewers will merge only if all these checks are true.

matthieu-nesme commented 7 years ago

I am not sure we want to add it manually to each test. It would be better if it could be added automatically to any Sofa_test, as it would satisfy 99% of the tests. It would be more convenient to have extra stuff only for the more complicated tests that raise an error on purpose.

damienmarchal commented 7 years ago

Hi mathieu,

This is a good question.

I recently wrote a lot test for sofa component and it appears to me that more than half of my tests cases are in fact verifying that component correctly rise errors or warnings when they are used improperly. Making these tests is very important to avoid un-detected bugs on the error handling code paths and it bring a lot to the general quality of sofa.

About the fact that the "error as test failure by default" is covering 99% of our tests... to me this is an indication that our tests are only covering half of the things they should and I'm not sure we should facilitate this behavior.

This is why I came up to making the things explicit in each test because it force the test writer to specify what are his underlying assumption and expectation for each of his tests making things very clear. In addition it put in front of his eyes what he is testing.. and what he is not testing.

PS: I really like the very detailed test failure that are reported with the explicit approach, we have either the filename & lineno of the failing test as well as the location of the message emission. eg: https://ci.inria.fr/sofa-ci/job/ubuntu_gcc-4.8_options/5020/testReport/(root)/DilateEngine_test_0/NormalBehavior/

EDIT: Your question make me think that some way to improve the things so maybe we can discuss what are the desirable features and how to expose them to the developpers. This is very hard because it is connected to lack of standardized way our code base reports errors (not exception, some uses error codes ,some uses messages). But I'm very afraid of starting such kind of very impacting discussion (I want to do UI ;)). In the meantime I suggest to use this PR.

EDIT2: what should i do with this: https://ci.inria.fr/sofa-ci/job/ubuntu_gcc-4.8_options/5020/testReport/(root)/TrianglePressureForceField_test_0/trianglePressureForceFieldTest/ ? 1) Should I ignore the warning in the test generation ?
2) Should the function be pure virtual instead ? 3) Should the warning message be a dmsg_warning instead of a msg_warning one ?

EDIT3: And this one https://ci.inria.fr/sofa-ci/job/ubuntu_gcc-4.8_options/5020/testReport/(root)/DifferenceEngine_test_0/DataTest/

damienmarchal commented 7 years ago

Looking at the source code I just find a good example.

It is from https://github.com/sofa-framework/sofa/blob/8410f29a80093f44fb0b224c91867bb7f65ee328/modules/SofaGeneralTopology/SofaGeneralTopology_test/SphereGridTopology_test.cpp In the SphereGridCreation test

Erik wanted to test both valid and invalid code path, and he is right to do so,...but as Sofa has no way to report error to the caller's he cannot detect that (I assume this is probably what he wanted to test and why he commented out line)

// EXPECT_EQ(sphereGrid2, nullptr);

With explicit message specification improve the situation as he can now write things like that:

bool SphereGridTopology_test::SphereGridCreation()
{
    // Creating a good Grid
    {
        //// This is valid code so a  warning or an error is probably a regression leading to test failure; 
        WarningAndErrorAsTestFailure err(SOURCE_LOCATION) ;

        SphereGridTopology::SPtr sphereGrid = sofa::core::objectmodel::New<SphereGridTopology>(5, 5, 5);
        EXPECT_NE(sphereGrid, nullptr);
        EXPECT_EQ(sphereGrid->d_radius.getValue(), 1.0);
    }

    // Creating a bad Grid
    {
        /// must send warn a warning message to the caller indicating something goes wrong
        /// mustn't send an error 
        ErrorAsTestFailure err(SOURCE_LOCATION) ;
        ExpectWarning      warn(SOURCE_LOCATION) ;

        SphereGridTopology::SPtr sphereGrid2 = sofa::core::objectmodel::New<SphereGridTopology>(-1, 0, 1);
    }

    return true;
}

EDIT: It just appear to me that the message rising failures are very similar in EXPECT_ /ASSERT things from gtest...nothing more, nothing less ...and similarly to EXPECT_ they are very verbose because it is what tests do :) 
Maybe I should make a macro to emphasize the similarities and showing this is part of the test framework ? 
```cpp
    EXPECT_MESSAGE( Error ) ;
    EXPECT_MESSAGE( Warning ) ;
matthieu-nesme commented 7 years ago

I do understand the idea. But I guess it should be possible to got only one component, let's call it WarningAndErrorInGTest, that would be created directly in Sofa_test (so added to every tests). This component would have two tristates {not_expected,expected,nothing}, one for warnings and one for errors. By default both states would be set to not_expected, such as by default every sofa warnings/errors would raise a gtest error. It would be enough for most of the tests, so the user has NOTHING to do when writing its tests (and it would work for any existing tests, incl. plugins').

For more complex tests like in your example, rather than creating ErrorAsTestFailure err(SOURCE_LOCATION) ; ExpectWarning warn(SOURCE_LOCATION) ; you could write something like: this->warningAndErrorInGTest.warningState = expected;

It is as explicit, not heavier, and even more flexible, as you can switch the state during the test (w/o having to delete objects and creating new ones).

I am note sure to be clear enough, do you follow me?

damienmarchal commented 7 years ago

Hi matt,

Can you have a look at my last version...the one that mimic the gtest API. It is not solving the point you are addressing but I see good things in this solution as it makes everything looking very consistant and in a test, no one complain there is a lot of EXPECT_ ... so adding few more to check the message behavior looks very fine to me. In addition it also report correctly where the problem arise instead of reporting that the problem happens in the base class.

Now about your suggestion... I like it because it is more like saying that everything that is not explicitely allowed is a failure and I really support that. I'm not yet sure how to mix the two things but we will find that ;)

PS: have you noticed that if we makes warning a failure by default we will have +30 test failure in the current sofa (which is fine to me because these failures a either showing a problem or a not precise enough test).

EDIT: I may have a drafted mix between the two approach soon.

damienmarchal commented 7 years ago

Hi matt,

I made a new version mixing you expectations and mine :) The implementation is probably drafty but I like the general API... Two aspect I like is that:

To give you a look of how it is (the V2 stuff are just transitional of course):

#include <SofaTest/Sofa_test.h>
using sofa::Sofa_test;

#include <SofaTest/TestMessageHandler.h>
using sofa::helper::logging::GtestMessageHandler ;

/// We can define a default policy for a complete class this way so that if not more
/// expectation are given this generates test failures.
class Sofa_test2 : public Sofa_test<float>
{
    EXPECT_MSG_NOEMIT_V2(Error) ;
    EXPECT_MSG_NOEMIT_V2(Warning) ;
    EXPECT_MSG_NOEMIT_V2(Deprecated) ;
};

class TestMessageHandler_test : public Sofa_test2
{
public:
    void defaultTestBehavior()
    {
        msg_deprecated("HERE") << "This should generate a failure" ;
        msg_warning("HERE") << "This should generate a failure"  ;
        msg_error("HERE") << "This should generate a failure" ;
    }

    void catchingTestBehavior()
    {
        EXPECT_MSG_EMIT_V2(Warning) ;
        EXPECT_MSG_EMIT_V2(Error) ;

        msg_warning("HERE") << "This should not generate a failure"  ;
        msg_error("HERE") << "This should not generate a test falure" ;
    }

    /// THIS TEST SHOULD FAIL.
    void expectAMessageissingBehavior()
    {
        EXPECT_MSG_EMIT_V2(Warning) ;
        EXPECT_MSG_EMIT_V2(Error) ;

        //msg_warning("HERE") << "This should not generate a failure"  ;
        //msg_error("HERE") << "This should not generate a test falure" ;
    }

    void noEmitTestBehavior()
    {
        EXPECT_MSG_NOEMIT_V2(Warning) ;
        EXPECT_MSG_NOEMIT_V2(Error) ;

        msg_warning("HERE") << "This should generate a failure but with line number close to " << __LINE__  ;
        msg_error("HERE") << "This should generate a test falure with line number close to " << __LINE__ ;
    }

    void complexTestBehavior()
    {
        {
            EXPECT_MSG_EMIT_V2(Warning) ;
            EXPECT_MSG_EMIT_V2(Error) ;

            //msg_warning("HERE") << "This should generate a failure"  ;
            //msg_error("HERE") << "This should generate a test failure" ;
            {
                EXPECT_MSG_NOEMIT_V2(Error) ;
                msg_error("HERE") << "This should generate a test failure" ;
            }
        }

        {
            EXPECT_MSG_NOEMIT_V2(Warning) ;
            EXPECT_MSG_NOEMIT_V2(Error) ;

            msg_warning("HERE") << "This should generate a failure"  ;
            msg_error("HERE") << "This should generate a test falure" ;
        }

    }
};

PS: if someone says it smell like 'asynchronous' exceptions I agree.

matthieu-nesme commented 7 years ago

I did not look at the implementation, but your example sounds really good to me! I like the scope/stacking.

With EXPECT_MSG_NOEMIT_V2(Error) ; EXPECT_MSG_NOEMIT_V2(Warning) ; EXPECT_MSG_NOEMIT_V2(Deprecated) ; by default in Sofa_test, as you well understood ;)

matthieu-nesme commented 7 years ago

Do we want to create a new PR to fix some tests, or should we start fixing them directly in this PR?

damienmarchal commented 7 years ago

Merge me I'm nice.

hugtalbot commented 7 years ago

Damn ! 23 commits, 63 files changed ! It's pretty dense, and thus hard to look into the code ! Let's discuss it at STC. We would need to find a way to highlight the key part in a PR.

damienmarchal commented 7 years ago

I think the number of commit is useless ! If you commit 55 times with things that are finally removed it looks dangerous but it is not. This tend to happen when working in the PR .

One solution is that you just "squash" everything in a second branch before making a PR. Or just ignore the commit number and squash everything while merging.

The number of file changed is a totally different issue... and is of course really meaningful and we should care about it by cutting big PR in smaller one.

matthieu-nesme commented 7 years ago

As soon as this PR is merged, we should fix the newly failing tests.

damienmarchal commented 7 years ago

Thanks @matthieu-nesme for the feedbacks and discussions.

About the implementation I agree there is some smoothing to do (there is redundency in the classes, the constructors with std::initializer are now useless as vs2013 don't like them)...

But I would be happy to have that happens in a second PR as I think I can change this without touching the API :)

damienmarchal commented 7 years ago

I cleaned the code by:

matthieu-nesme commented 7 years ago

Ok to be merged. But should not we prepare the next PR correcting most of the tests before merging this PR?

matthieu-nesme commented 7 years ago

Newly failing tests can be fixed in #237.

matthieu-nesme commented 7 years ago

Maybe warning and deprecated messages should not be seen as failure by default?

matthieu-nesme commented 7 years ago

there is no more color in the console!?

damienmarchal commented 7 years ago

About failure by default on warning & deprecated...this is a choice.

The positive thing about this 'bold' behavior is that it force tests maker to care about warning and deprecated messages otherwise they tend to ignore them. As I prefer to specify explicitely in every test case the expected behavior I dont' feel very concerned about that 'default mode'. But I think this 'bold' behavior sound nice to me and from the new failure we can see in the test reports this mode is rising interesting issue that requires investigations.

About the disable color while testing. I disable it only for the tests to see if it improve the readability. Now red in the console only means that a test have failed and it not obfuscated with the red from an expected error message. (I 'm not sure I'm clear here).

You can change that if you prefer :)

EDIT: I update the text to be more clear :)

damienmarchal commented 7 years ago

[ci-build]

matthieu-nesme commented 7 years ago

For me there is no more color, neither gtest's or sofa's.

The Fatal messages should also be considered as failures by default. Can you see why it is annoying to copy default behaviors to each test? Now we have to modify every of them rather than only a the default place... If we let deprecated and warning messages on the side for now, but we want to activate them later, it is the same story.

damienmarchal commented 7 years ago

Hi matt,

1) Color are not disabled in runSofa for me. It is only one line that disable them in Sofa_test.cpp and I don't see why it would impact something else. Do you have any idea ? 2) Fatal is already part of the default mode. 3) I removed Warning&Deprecated by default 4) There is no need to change all the tests if in the future if we want to reactivate Warning&Deprecated by default, we will just have to change few lines in Sofa_test.h. Maybe my scoping examples was not clear but the whole system is scoped per type of message so EXPECT_MSG_NOEMIT(Error) in tests have not consequence on the fact that you can add EXPECT_MSG_NOEMIT(Warning) in Sofa_test to catch all warnings.
Do you see what I mean ?

About explicit test vs default mode...
To me if the only aspect we care is how fast we can disable/re-enable tests failures then the default mode in Sofa_test is great. But is this the only aspect we care ? What about the fact that default mode is reporting useless filename&line (.../Sofa_test:93). Because of that I loose time each time I have to investigate test failures. On the contrary explicit messages specifications reports real file&line and allow to narrow the problem to very precise code location which save time. And these savings sums up each time someone is looking at tests failures.

In general I write a test once... but investigate failures several time... so I find it worth the effort of spending a bit more time on writing a good test with narrow message catching because they proved to saves time while investigating failure.

This is a totally different story with the time saving of changing the default mode by changing Sofa_test. It saves time as long as we are hesitating on what should be the default mode but after that it will slow down all failure investigations.

So... I made my best to explain why I care of explicit test but the important things to me are:

matthieu-nesme commented 7 years ago

So... I made my best to explain why I care of explicit test but the important things to me are:

  • what do I have to change so that PR can be be merged so that I can close it and move forward ?

Only one last thing I did not get. There are EXPECT_NOEMIT, EXPECT_EMIT, what about a EXPECT_NOTHING_I_DO_NOT_CARE?! That would correspond to the default behavior for warnings for instance.

  • do we have the time & courage to fix the 200 failing tests if we activate failure on Warning&Deprecated >?

We should at least fix some of them that are real crap, and must not test anything.

hugtalbot commented 7 years ago

As soon as you provide me a ChangeLog @damienmarchal I will merge it

damienmarchal commented 7 years ago

It is at the end of the PR description :)

damienmarchal commented 7 years ago

[ci-build]

damienmarchal commented 7 years ago

[ci-build]