KratosMultiphysics / Kratos

Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos is released under a BSD license and is written in C++ with an extensive Python interface.
https://kratosmultiphysics.github.io/Kratos/

Non-deterministic tests #1121

Closed oberbichler closed 6 years ago

oberbichler commented 6 years ago

Hi! When I run the tests from the StructuralMechanicsApplication under Windows I get different results without changing anything. For example test_patch_test_large_strain.py:

[Screenshots: four command-prompt captures taken on 2017-12-04]

loumalouomega commented 6 years ago

Please, can you try to run in FullDebug?

philbucher commented 6 years ago

@oberbichler please try in FullDebug. I tried several times in Release and in FullDebug last week and I didn't get these errors. In fact, the solids always worked for me.

oberbichler commented 6 years ago

It is very strange because I get the same error on two computers. Both with VS2017.4, Boost 1.65.1 and Python 3.6. In FullDebug I get a memory leak and runkratos.exe always crashes.

loumalouomega commented 6 years ago

Does the error happen only in that configuration? It doesn't happen on Linux or in other configurations?

oberbichler commented 6 years ago

No one else has this problem, and I think I am the only one here who uses Windows in this configuration.

The analysis works well.

oberbichler commented 6 years ago

After attaching the debugger, it seems that boost::python has some trouble with string arguments. But I am not sure if the setup is correct. Does anyone debug Kratos under Windows?

RiccardoRossi commented 6 years ago

Hi @oberbichler, could it be an ABI incompatibility between compiler versions? I mean, something like Python having been compiled with a different version of Visual Studio than boost_python and Kratos, or anything of the sort...

pooyan-dadvand commented 6 years ago

Have you run it through valgrind, running the same test on Linux with the same Boost? Sometimes a memory error only appears on some machines and not on others.

oberbichler commented 6 years ago

Everything is compiled with the same VS2017. Same bug on a different machine with same configuration.

When I run the test under Linux everything works fine (same machine, Fedora 27, gcc 7.2.1, boost 1.65.1, python 3.6).

Maybe I missed something...

RiccardoRossi commented 6 years ago

But you did not compile Python yourself, right?

oberbichler commented 6 years ago

No, I used miniconda and the "classical" python.

pooyan-dadvand commented 6 years ago

> When I run the test under Linux everything works fine (same machine, Fedora 27, gcc 7.2.1, boost 1.65.1, python 3.6).

I understand that the tests run. My question was whether valgrind reports anything related to Kratos (ignoring the Python reports).

RiccardoRossi commented 6 years ago

Closing, as this has been silent for a while (since December). Please reopen if it is still active.

philbucher commented 6 years ago

Guys, I don't like being the bad guy bringing this up again, but I also have rare random failures:

test_UL_3D_hexa (test_patch_test_large_strain.TestPatchTestLargeStrain) ... terminate called after throwing an instance of 'Kratos::Exception'
  what():  Error: WARNING:: ELEMENT ID: 2 INVERTED. DETJ0: -2.20488e+08

in /home/philippb/software/Kratos_dev/applications/StructuralMechanicsApplication/custom_elements/updated_lagrangian.cpp:272:virtual void UpdatedLagrangian::CalculateKinematicVariables(BaseSolidElement::KinematicVariables&, unsigned int, const IntegrationPointsArrayType&)
   /home/philippb/software/Kratos_dev/applications/StructuralMechanicsApplication/custom_elements/updated_lagrangian.cpp:249:virtual void UpdatedLagrangian::CalculateAll(Element::MatrixType&, Element::VectorType&, ProcessInfo&, bool, bool)
   kratos/solving_strategies/schemes/residualbased_incrementalupdate_static_scheme.h:291:void ResidualBasedIncrementalUpdateStaticScheme<TSparseSpace, TDenseSpace>::CalculateSystemContributions(Element::Pointer, ResidualBasedIncrementalUpdateStaticScheme<TSparseSpace, TDenseSpace>::LocalSystemMatrixType&, ResidualBasedIncrementalUpdateStaticScheme<TSparseSpace, TDenseSpace>::LocalSystemVectorType&, Element::EquationIdVectorType&, ProcessInfo&) [with TSparseSpace = UblasSpace<double, boost::numeric::ublas::compressed_matrix<...>, boost::numeric::ublas::vector<double> >; TDenseSpace = UblasSpace<double, boost::numeric::ublas::matrix<double>, boost::numeric::ublas::vector<double> >; Element::Pointer = boost::shared_ptr<Element>; ResidualBasedIncrementalUpdateStaticScheme<TSparseSpace, TDenseSpace>::LocalSystemMatrixType = boost::numeric::ublas::matrix<double>; ResidualBasedIncrementalUpdateStaticScheme<TSparseSpace, TDenseSpace>::LocalSystemVectorType = boost::numeric::ublas::vector<double>; Element::EquationIdVectorType = vector<long unsigned int>]

Aborted (core dumped)

or

======================================================================
FAIL: test_TL_3D_hexa (test_patch_test_large_strain.TestPatchTestLargeStrain)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/philippb/software/Kratos_dev/applications/StructuralMechanicsApplication/tests/test_patch_test_large_strain.py", line 512, in test_TL_3D_hexa
    self._check_results(mp,A,b)
  File "/home/philippb/software/Kratos_dev/applications/StructuralMechanicsApplication/tests/test_patch_test_large_strain.py", line 181, in _check_results
    self.assertAlmostEqual(d[2], u[2])
AssertionError: 46821656707.80075 != 46821656707.80076 within 7 places

Ubuntu 16.04, GCC 5.4, Release, couldn't reproduce in FullDebug yet

It happens ~ every 10th run
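
A note on the second failure above: `assertAlmostEqual` by default rounds the absolute difference to 7 decimal places. For values around 4.7e10, neighbouring doubles are already about 7.6e-6 apart, so that check can only pass when both values are bitwise identical. The sketch below reproduces the comparison with the two numbers from the traceback and contrasts it with a relative tolerance. It assumes a relative check would be acceptable here, which the thread does not confirm, and the helper names are hypothetical. It does not explain why the result changes between runs.

```python
import math

# Values copied from the AssertionError in the traceback above.
d = 46821656707.80075
u = 46821656707.80076


def absolute_check_passes(a, b, places=7):
    """Mimic unittest's default assertAlmostEqual rule:
    equal values pass, otherwise round(|a - b|, places) must be 0."""
    if a == b:
        return True
    return round(abs(a - b), places) == 0


def relative_check_passes(a, b, rel_tol=1e-9):
    """Relative comparison; rel_tol is an assumed value, not one
    taken from the Kratos test suite."""
    return math.isclose(a, b, rel_tol=rel_tol)


if __name__ == "__main__":
    print("absolute difference:", abs(d - u))
    print("passes with places=7:", absolute_check_passes(d, u))
    print("passes with rel_tol=1e-9:", relative_check_passes(d, u))
```

A looser tolerance would only hide the symptom. The inverted-element exception in the first failure suggests the computed values themselves vary from run to run.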

loumalouomega commented 6 years ago

Uuuuuuuum, no idea why

pooyan-dadvand commented 6 years ago

Has all the symptoms of a memory leak... A valgrind pass in full debug would help
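
Since the failure reported above shows up only about every tenth run, a single clean run says little. A minimal sketch for estimating the failure rate by rerunning a test script and counting non-zero exit codes follows. The helper name `count_failures` is hypothetical and not part of Kratos. A crash such as "Aborted (core dumped)" also yields a non-zero return code, so it is counted as well.

```python
import subprocess
import sys


def count_failures(command, runs):
    """Run `command` `runs` times and return how many runs exited
    with a non-zero return code (test failures and crashes alike)."""
    failures = 0
    for _ in range(runs):
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,  # discard test output, keep only the exit code
            stderr=subprocess.DEVNULL,
        )
        if completed.returncode != 0:
            failures += 1
    return failures
```

For example, `count_failures([sys.executable, "test_patch_test_large_strain.py"], 50)` run from the tests directory gives a rough failure count for 50 runs. The same loop can be repeated on a FullDebug build to see whether the failures disappear there.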

oberbichler commented 6 years ago

Btw: I never found a solution for my problem. But the analysis runs fine...

philbucher commented 6 years ago

I will try to get the valgrind output

RiccardoRossi commented 6 years ago

the bbar test is fixed in #1629

marandra commented 6 years ago

There are similar errors in the small strain test (fixed now) and in the large strain tests. I will fix them and push the changes.

philbucher commented 6 years ago

I think the errors are fixed in #1629, so I will close this and reopen it in case they reappear.

marandra commented 6 years ago

One question, I see a lot of verbosity when running test_StructuralMechanicsApplication.py. Is it expected?

philbucher commented 6 years ago

You can select the level of verbosity (see here), e.g. python3 test_StructuralMechanicsApplication.py -v0 to mute it. Could you still post the output?

marandra commented 6 years ago

The output looks like this, and it goes on and on. Could it be because I am in Debug mode?

15:42:29 ~/.../StructuralMechanicsApplication/tests$ python3 test_StructuralMechanicsApplication.py 
 |  /           |             
 ' /   __| _` | __|  _ \   __|
 . \  |   (   | |   (   |\__ \ 
_|\_\_|  \__,_|\__|\___/ ____/
           Multi-Physics 5.3.0-c9fd4c3760-Debug
Importing    KratosStructuralMechanicsApplication
     KRATOS   ___|  |                   |                   |                     
            \___ \  __|  __| |   |  __| __| |   |  __| _` | |                   
                  | |   |    |   | (    |   |   | |   (   | |                     
            _____/ \__|_|   \__,_|\___|\__|\__,_|_|  \__,_|_| MECHANICS     
Importing    KratosExternalSolversApplication
Initializing KratosExternalSolversApplication... 
FEASTSolver solver is not included in the compilation of the External Solvers Application
..............  [Reading Nodes    : 8 nodes read]
  [Reading Elements : 5 elements read] [Type: SmallDisplacementElement2D4N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Total Lines Read : 133]
.  [Reading Nodes    : 8 nodes read]
  [Reading Elements : 10 elements read] [Type: SmallDisplacementElement2D3N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Total Lines Read : 143]
.  [Reading Nodes    : 8 nodes read]
  [Reading Elements : 5 elements read] [Type: SmallDisplacementElement2D4N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Total Lines Read : 88]
.  [Reading Nodes    : 8 nodes read]
  [Reading Elements : 10 elements read] [Type: SmallDisplacementElement2D3N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Total Lines Read : 98]
.  [Reading Nodes    : 16 nodes read]
  [Reading Elements : 5 elements read] [Type: SmallDisplacementElement3D8N]
  [Reading Conditions : 1 conditions read] [Type: SurfaceLoadCondition3D4N]
  [Reading Conditions : 1 conditions read] [Type: SurfaceLoadCondition3D4N]
  [Reading Conditions : 1 conditions read] [Type: SurfaceLoadCondition3D4N]
  [Reading Conditions : 1 conditions read] [Type: SurfaceLoadCondition3D4N]
  [Total Lines Read : 182]
.  [Reading Nodes    : 16 nodes read]
  [Reading Elements : 30 elements read] [Type: SmallDisplacementElement3D4N]
  [Reading Conditions : 2 conditions read] [Type: SurfaceLoadCondition3D3N]
  [Reading Conditions : 2 conditions read] [Type: SurfaceLoadCondition3D3N]
  [Reading Conditions : 2 conditions read] [Type: SurfaceLoadCondition3D3N]
  [Reading Conditions : 2 conditions read] [Type: SurfaceLoadCondition3D3N]
  [Total Lines Read : 240]
.  [Reading Nodes    : 16 nodes read]
  [Reading Elements : 5 elements read] [Type: SmallDisplacementElement3D8N]
  [Reading Conditions : 1 conditions read] [Type: SurfaceLoadCondition3D4N]
  [Total Lines Read : 131]
.  [Reading Nodes    : 16 nodes read]
  [Reading Elements : 30 elements read] [Type: SmallDisplacementElement3D4N]
  [Reading Conditions : 2 conditions read] [Type: SurfaceLoadCondition3D3N]
  [Total Lines Read : 183]
.  [Reading Nodes    : 8 nodes read]
  [Reading Elements : 5 elements read] [Type: TotalLagrangianElement2D4N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Total Lines Read : 132]
.  [Reading Nodes    : 8 nodes read]
  [Reading Elements : 10 elements read] [Type: TotalLagrangianElement2D3N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]
  [Reading Conditions : 1 conditions read] [Type: LineLoadCondition2D2N]

philbucher commented 6 years ago

This is normal. I think these prints come from ModelPartIO. Also, setting the verbosity to zero doesn't remove them.

maybe @roigcarlo knows more?
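
The behaviour described above matches how Python's `unittest` runner works in general: its verbosity setting only controls what the runner itself reports, not what the code under test writes to standard output. The sketch below illustrates this with a hypothetical `NoisyTest` whose `print` stands in for the ModelPartIO messages. It uses plain `unittest`, not the Kratos test runner, whose internals are not shown in this thread.

```python
import contextlib
import io
import unittest


class NoisyTest(unittest.TestCase):
    def test_something(self):
        # Stand-in for output written directly by the code under test.
        print("[Reading Nodes    : 8 nodes read]")
        self.assertTrue(True)


def run_quietly():
    """Run NoisyTest with verbosity=0 and return the result, the
    runner's own report, and whatever the test printed to stdout."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NoisyTest)
    runner_stream = io.StringIO()
    captured_stdout = io.StringIO()
    with contextlib.redirect_stdout(captured_stdout):
        runner = unittest.TextTestRunner(stream=runner_stream, verbosity=0)
        result = runner.run(suite)
    return result, runner_stream.getvalue(), captured_stdout.getvalue()


if __name__ == "__main__":
    result, report, leaked = run_quietly()
    print("runner report:", repr(report))
    print("printed by the test:", repr(leaked))
```

Note that `contextlib.redirect_stdout` only replaces Python's `sys.stdout`. Output written by C++ code to the process's standard output would not be captured this way and would need to be redirected at the file-descriptor level or routed through a logger.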

pooyan-dadvand commented 6 years ago

@roigcarlo would you please pass the tester output stream to the default logger output?