Tudat / tudat

NOTE: This Tudat version is no longer supported. See https://docs.tudat.space/en/stable/ and https://github.com/tudat-team/tudat-bundle for the new version
BSD 3-Clause "New" or "Revised" License

Test failure: CowellStateDerivative "SIGSEGV, si_code: 0 (memory access violation at address: 0x00000000)" #136

Closed mfacchinelli closed 7 years ago

mfacchinelli commented 7 years ago

While running the tests, test 73 test_CowellStateDerivative fails. My operating system is macOS 10.12.3. The log file is attached.

LastTest.txt

DominicDirkx commented 7 years ago

It seems like this is a 'real' failure of one of the two test cases in this file. All other tests seem to run fine, so I don't think it's any cause for concern.

However, I would like to have a closer look to make sure. Would you have time next Thursday after/in the break of the lecture?

mfacchinelli commented 7 years ago

Ok, I'll be there. Thank you!

GigiLaan commented 7 years ago

I have the same error, operating system MacOS 10.11.6. I attached my own log file.

LastTest.txt

DominicDirkx commented 7 years ago

One more report of this in issue #140

mfacchinelli commented 7 years ago

(Screenshot attached: 2017-02-23 at 14:51:01)

transferorbit commented 7 years ago

For what it’s worth, I’ve just encountered exactly the same failure when running this unit test from the command line on macosx 10.12.3.

unknown location:0: fatal error in "**testCowellPopagatorKeplerCompare**": signal: SIGSEGV, si_code: 0 (memory access violation at address: 0x00000000)
/Users/kevin/tudatBundle/tudat/Tudat/Astrodynamics/Propagators/UnitTests/unitTestCowellStateDerivative.cpp:479: last checkpoint
*** 1 failure detected in test suite "Master Test Suite"

DominicDirkx commented 7 years ago

@eurospaceflight : Thanks for letting me know!

Could you do me a favor and recompile the unit test with debug symbols on: Change

CMAKE_BUILD_TYPE:STRING=Release

to

CMAKE_BUILD_TYPE:STRING=Debug

in the CMakeCache.txt file. This file should be in your build folder. Changing this will force the code to recompile all required libraries. After recompiling, could you run the debugger on the executable in the terminal:

gdb ./test_CowellStateDerivative

run

Then, after the program terminates, typing the command:

backtrace

should print a long list of function calls, each with its source line. Could you post this output here? It would help a lot in figuring out this issue. Let me know,

Cheers,

Dominic

transferorbit commented 7 years ago

Hello Dominic,

I attempted to follow the above steps as closely as possible; I did so as follows:

*** No errors detected
Process 77055 exited with status = 0 (0x00000000)

- Finally, I entered the `bt` command (apparently the `backtrace` equivalent). However, that produced the following result: `error: invalid thread`
    - I also immediately tried `thread backtrace` and `bt all`, but I received the same `error: invalid thread` message.

This is as far as I was able to get; I don’t know where to go from here. Do you have any further suggestions?

magnific0 commented 7 years ago

@eurospaceflight, thanks for running the debugger for us. bt is just the shorthand for backtrace, bt works on gdb as well.

The `invalid thread` messages appear because the process exited without a segmentation fault, so there was no stopped thread left to inspect. Apparently the problem goes away when you compile the debug binaries.

Unfortunately, this is common for errors like this one, which are sensitive to the compiler.

Could you try again, but with the release binary? The output won't be as informative, but at least it's something.

transferorbit commented 7 years ago

Your wish is my command line. I recompiled for Release instead of Debug and reran as follows:

Running 2 test cases...
Warning, position of Mars taken as barycenter of that body's planetary system.
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Warning, tabulated ephemeris is being reset using data at different precision
Process 21505 stopped

DominicDirkx commented 7 years ago

@eurospaceflight Like @magnific0 said, this seems to be one of those fun little errors that go away when you start looking for them (sort of like Schrödinger's bug...). I'll run the program with valgrind on my computer; hopefully that will give us some extra information.

DominicDirkx commented 7 years ago

@eurospaceflight I have a few other ideas to try to figure out this bug. It all seems to be happening in the execution of the integration of the second unit test.

First, could you comment out, in turn, the last three, the last two, and the last one of the following lines:

    testCowellPropagationOfKeplerOrbit< double, double >( );
    testCowellPropagationOfKeplerOrbit< double, long double >( );
    testCowellPropagationOfKeplerOrbit< Time, double >( );
    testCowellPropagationOfKeplerOrbit< Time, long double >( );

way at the bottom of the file, and let me know what the result is? This will let us check if the problem is with a specific combination of state scalar/time types.
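As an alternative to commenting lines out by hand, the same isolation could be sketched roughly as follows. This is only an illustration: `runPropagationCase`, `makeCases`, `runFirstCases`, `LabelledCase` and the `Time` struct are hypothetical stand-ins and not part of Tudat, and the order of the two template arguments is simply copied from the lines above. The idea is to print and flush a label before each case, so that the last label shown identifies the combination that was running when the process crashed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for Tudat's Time class (assumption: any distinct type will do for this sketch).
struct Time { double seconds; };

// Hypothetical stand-in for testCowellPropagationOfKeplerOrbit< A, B >( ). It performs no
// propagation; it only returns sizeof( SecondType ) so that the sketch compiles on its own.
template< typename FirstType, typename SecondType >
std::size_t runPropagationCase( )
{
    return sizeof( SecondType );
}

typedef std::pair< std::string, std::function< std::size_t( ) > > LabelledCase;

// The four type combinations, in the same order as the calls in the unit test.
inline std::vector< LabelledCase > makeCases( )
{
    return {
        { "double, double", [ ]( ){ return runPropagationCase< double, double >( ); } },
        { "double, long double", [ ]( ){ return runPropagationCase< double, long double >( ); } },
        { "Time, double", [ ]( ){ return runPropagationCase< Time, double >( ); } },
        { "Time, long double", [ ]( ){ return runPropagationCase< Time, long double >( ); } }
    };
}

// Run at most the first numberOfCases combinations and return how many were run.
// Each label is written and flushed before its case starts, so it is already on
// screen if that case brings the process down.
inline std::size_t runFirstCases( const std::size_t numberOfCases, std::ostream& log = std::cerr )
{
    const std::vector< LabelledCase > cases = makeCases( );
    assert( !cases.empty( ) );

    std::size_t casesRun = 0;
    for( std::size_t i = 0; i < cases.size( ) && i < numberOfCases; i++ )
    {
        log << "Running < " << cases[ i ].first << " >" << std::endl;  // std::endl flushes
        cases[ i ].second( );
        casesRun++;
    }
    return casesRun;
}
```

Calling `runFirstCases( 1 )`, `runFirstCases( 2 )` and `runFirstCases( 3 )` corresponds to commenting out the last three, the last two and the last one of the four calls, respectively.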

Second, could you try changing:

    SingleArcDynamicsSimulator< StateScalarType, TimeType > dynamicsSimulator(
                bodyMap, integratorSettings, propagatorSettings, true, false, true );

    Eigen::Matrix< StateScalarType, 6, 1 > initialKeplerElements =
            orbital_element_conversions::convertCartesianToKeplerianElements< StateScalarType >(
                Eigen::Matrix< StateScalarType, 6, 1 >( systemInitialState ), effectiveGravitationalParameter );

    // Compare numerical state and kepler orbit at each time step.
    boost::shared_ptr< Ephemeris > moonEphemeris = bodyMap.at( "Moon" )->getEphemeris( );
    double currentTime = initialEphemerisTime + buffer;
    while( currentTime < finalEphemerisTime - buffer )
    {
        Eigen::VectorXd stateDifference
                = ( orbital_element_conversions::convertKeplerianToCartesianElements(
                        propagateKeplerOrbit< StateScalarType >( initialKeplerElements, currentTime - initialEphemerisTime,
                                                                 effectiveGravitationalParameter ),
                        effectiveGravitationalParameter )
                    - moonEphemeris->template getTemplatedStateFromEphemeris< StateScalarType >( currentTime ) ).
                template cast< double >( );

        for( int i = 0; i < 3; i++ )
        {
            BOOST_CHECK_SMALL( stateDifference( i ), 1E-3 );
            BOOST_CHECK_SMALL( stateDifference( i + 3 ), 1.0E-9 );
        }
        currentTime += 10000.0;
    }

to

    SingleArcDynamicsSimulator< StateScalarType, TimeType > dynamicsSimulator(
                    bodyMap, integratorSettings, propagatorSettings, true, false, true );

and see if it runs properly? I'd be quite surprised to see any change, but you never know. Afterwards, could you change it to:

```
SingleArcDynamicsSimulator< StateScalarType, TimeType > dynamicsSimulator(
                bodyMap, integratorSettings, propagatorSettings, true, false, false );
```

and try again?

With a little luck, this last one will run and give us some idea of how to fix it. The main difference in this last version is that the numerically integrated state is not used to update the ephemeris of the propagated body.

Let me know, whenever you have time, what the outcome is,

Cheers,

Dominic

DominicDirkx commented 7 years ago

There has not been any progress on this issue for some time. Also, there have been no reports from new users of this issue occurring.

@transferorbit Could you pull the development branch (tudatBundle and tudat) to see if this issue persists?

DominicDirkx commented 7 years ago

There has been no progress or report on this in about 6 months, so I'm closing this issue. If it reoccurs during Mac tests of the latest code, it should be tracked in a new issue.