ElmerCSC / elmerfem

Official git repository of Elmer FEM software
http://www.elmerfem.org
Other
1.16k stars 316 forks source link

Test `CurvedBoundaryCylHquadratic` failing on Windows (MinGW) #521

Closed mmuetzel closed 3 weeks ago

mmuetzel commented 1 month ago

The test CurvedBoundaryCylHquadratic is currently failing on Windows with the following output:

$ ../../src/ElmerSolver
ELMER SOLVER (v 9.0) STARTED AT: 2024/08/09 17:52:14
ParCommInit:  Initialize #PEs:            1
MAIN:
MAIN: =============================================================
MAIN: ElmerSolver finite element software, Welcome!
MAIN: This program is free software licensed under (L)GPL
MAIN: Copyright 1st April 1995 - , CSC - IT Center for Science Ltd.
MAIN: Webpage http://www.csc.fi/elmer, Email elmeradm@csc.fi
MAIN: Version: 9.0 (Rev: 733f7f2af, Compiled: 2024-08-09)
MAIN:  Running one task without MPI parallelization.
MAIN:  Running with just one thread per task.
MAIN:  Zoltan library linked in.
MAIN: =============================================================
MAIN:
MAIN:
MAIN: -------------------------------------
MAIN: Reading Model: case.sif
LoadInputFile: Scanning input file: case.sif
LoadInputFile: Scanning only size info
LoadInputFile: First time visiting
LoadInputFile: Reading base load of sif file
LoadInputFile: Loading input file: case.sif
LoadInputFile: Reading base load of sif file
CheckKeyword:  Unlisted keyword: [vtu: constraint modes analysis] in section: [simulation]
CheckKeyword:  Unlisted keyword: [increase element order] in section: [simulation]
CheckKeyword:  Unlisted keyword: [constraint modes matrix results] in section: [solver 1]
CheckKeyword:  Unlisted keyword: [expression 1] in section: [solver 2]
CheckKeyword:  Unlisted keyword: [expression 2] in section: [solver 2]
CheckKeyword:  Unlisted keyword: [follow cylinder boundary] in section: [boundary condition 1]
CheckKeyword:  Unlisted keyword: [follow cylinder boundary] in section: [boundary condition 2]
LoadInputFile: Number of BCs: 4
LoadInputFile: Number of Body Forces: 0
LoadInputFile: Number of Initial Conditions: 0
LoadInputFile: Number of Materials: 1
LoadInputFile: Number of Equations: 1
LoadInputFile: Number of Solvers: 2
LoadInputFile: Number of Bodies: 2
ListTagKeywords: Setting weight for keywords!
ListTagKeywords: No parameters width suffix: normalize by area
ListTagKeywords: Setting weight for keywords!
ListTagKeywords: No parameters width suffix: normalize by volume
ElmerAsciiMesh: Base mesh name: ./cylinder_in_cylinder
CylinderFit: Axis coordinate set to be: 1
ERROR:: LUDecomp: Matrix is singular.
ERROR:: InvertMatrix: inversion needs successfull LU-decomposition
STOP 1

Compared with the output of the same test on Ubuntu 24.04:

$ ../../src/ElmerSolver
ELMER SOLVER (v 9.0) STARTED AT: 2024/08/09 17:53:16
ParCommInit:  Initialize #PEs:            1
MAIN: 
MAIN: =============================================================
MAIN: ElmerSolver finite element software, Welcome!
MAIN: This program is free software licensed under (L)GPL
MAIN: Copyright 1st April 1995 - , CSC - IT Center for Science Ltd.
MAIN: Webpage http://www.csc.fi/elmer, Email elmeradm@csc.fi
MAIN: Version: 9.0 (Rev: 2766add03, Compiled: 2024-08-09)
MAIN:  Running one task without MPI parallelization.
MAIN:  Running with just one thread per task.
MAIN: =============================================================
MAIN: 
MAIN: 
MAIN: -------------------------------------
MAIN: Reading Model: case.sif
LoadInputFile: Scanning input file: case.sif
LoadInputFile: Scanning only size info
LoadInputFile: First time visiting
LoadInputFile: Reading base load of sif file
LoadInputFile: Loading input file: case.sif
LoadInputFile: Reading base load of sif file
CheckKeyword:  Unlisted keyword: [vtu: constraint modes analysis] in section: [simulation]
CheckKeyword:  Unlisted keyword: [increase element order] in section: [simulation]
CheckKeyword:  Unlisted keyword: [constraint modes matrix results] in section: [solver 1]
CheckKeyword:  Unlisted keyword: [expression 1] in section: [solver 2]
CheckKeyword:  Unlisted keyword: [expression 2] in section: [solver 2]
CheckKeyword:  Unlisted keyword: [follow cylinder boundary] in section: [boundary condition 1]
CheckKeyword:  Unlisted keyword: [follow cylinder boundary] in section: [boundary condition 2]
LoadInputFile: Number of BCs: 4
LoadInputFile: Number of Body Forces: 0
LoadInputFile: Number of Initial Conditions: 0
LoadInputFile: Number of Materials: 1
LoadInputFile: Number of Equations: 1
LoadInputFile: Number of Solvers: 2
LoadInputFile: Number of Bodies: 2
ListTagKeywords: Setting weight for keywords!
ListTagKeywords: No parameters width suffix: normalize by area
ListTagKeywords: Setting weight for keywords!
ListTagKeywords: No parameters width suffix: normalize by volume
ElmerAsciiMesh: Base mesh name: ./cylinder_in_cylinder
CylinderFit: Axis coordinate set to be: 3
CylinderFit: Axis coordinate set to be: 3
IncreaseElementOrder: Increasing element order from linear to quadratic!
ReleaseMeshFaceTables: Releasing number of faces: 674
IncreaseElementOrder: Elements increased to 2nd order serendipity elements
LoadMesh: Elapsed REAL time:     0.0051 (s)
MAIN: -------------------------------------
AddVtuOutputSolverHack: Adding ResultOutputSolver to write VTU output in file: case
StatElecSolver_init: Using Constraint Modes functionality for Capacitance Matrix
StatElecSolver_init: Suppressing bandwidth optimization in Capacitance Matrix computation!
OptimizeBandwidth: Initial bandwidth for electrostatics: 556
MAIN: Number of timesteps to be saved: 1
MAIN: 
MAIN: -------------------------------------
MAIN:  Steady state iteration:            1
MAIN: -------------------------------------
MAIN: 
StatElecSolver: ------------------------------------------------
StatElecSolver: Solving static electric field for insulators
StatElecSolverBulkAssembly: Elapsed REAL time:     0.0242 (s)
StatElecSolverBCAssembly: Elapsed REAL time:     0.0026 (s)
BiCGStabl:        5 0.8135E-10 0.8135E-10
ComputeChange: NS (ITER=1) (NRM,RELC): ( 0.62583759      2.0000000     ) :: electrostatics
StoreLumpedFluxes: Allocating lumped model of size: 1
FinalizeLumpedMatrix: Lumped Matrix
FinalizeLumpedMatrix:   1  1  1.001402000E+00
FinalizeLumpedMatrix: Constraint modes fluxes was saved to file CapacitanceMatrix.dat
FinalizeLumpedMatrix:  Normalization checksum:
FinalizeLumpedMatrix: Adding Constraint Modes Fluxes with "res:" to list
ComputeChange: SS (ITER=1) (NRM,RELC): ( 0.62583759      2.0000000     ) :: electrostatics
SaveScalars: -----------------------------------------
SaveScalars: Saving scalar values of various kinds
SaveScalars: Saving names of values to file: ./f.dat.names
SaveScalars: Saving values to file: ./f.dat
ComputeChange: SS (ITER=1) (NRM,RELC): (  1.0014020      0.0000000     ) :: savescalars
ComputeChange: SS (ITER=1) (NRM,RELC): (  1.0014020      2.0000000     ) :: savescalars
ResultOutputSolver: -------------------------------------
ResultOutputSolver: Saving with prefix: case
ResultOutputSolver: Creating list for saving - if not present
CreateListForSaving: Field Variables for Saving
ResultOutputSolver: Saving in unstructured VTK XML (.vtu) format
VtuOutputSolver: Saving results in VTK XML format with prefix: case
VtuOutputSolver: Saving number of partitions: 1
VtuOutputSolver: Saving each mode to different file
ResultOutputSolver: -------------------------------------
CompareToReferenceSolution: Solver 1 PASSED:  Norm = 6.25837594E-01  RefNorm = 6.25837594E-01
CompareToReferenceSolution: Relative Error to reference norm: 1.881983E-10
CompareToReferenceSolution: Solver 2 PASSED:  Norm = 1.00140200E+00  RefNorm = 1.00140200E+00
CompareToReferenceSolution: Relative Error to reference norm: 4.099167E-10
CompareToReferenceSolution: PASSED all 2 tests!
MAIN: *** Elmer Solver: ALL DONE ***
MAIN: The end
SOLVER TOTAL TIME(CPU,REAL):         0.13        0.60
ELMER SOLVER FINISHED AT: 2024/08/09 17:53:16

Notably, CylinderFit: Axis coordinate set to be: is 1 on Windows and 3 on Ubuntu.

When running with gdb to a breakpoint at MeshUtils.F90:14040, I see the following on Ubuntu:

Thread 1 "ElmerSolver" hit Breakpoint 1, meshutils::cylinderfit (pmesh=0x5555557fdfc0, pparams=0x5555558f9df0, bcind=1, dim=3, fitparams=...)
    at .../elmerfem/fem/src/MeshUtils.F90:14040
14040       AxisI = 1 
(gdb) p NiNj
$1 = (31.906038461164012, -1.4857677310686768e-09, 1.3877787807814457e-16, -1.4857677310686768e-09, 31.906038464428274, 5.5511151231257827e-17, 1.3877787807814457e-16, 5.5511151231257827e-17, 0.18792307440770833)

While I see the following on Windows:

Thread 1 hit Breakpoint 1, meshutils::cylinderfit (pmesh=0x7940a180, pparams=0x78fd5fc0, bcind=1, dim=<optimized out>, fitparams=...)
    at .../elmerfem/fem/src/MeshUtils.F90:14040
14040       AxisI = 1
(gdb) p NiNj
$1 = (0, 0, 0, 0, 0, 0, 0, 0, 0)

So, it looks like NiNj hasn't been set on Windows for some reason.

Do you have any hints how to look deeper into this issue?

mmuetzel commented 1 month ago

It looks like it's the same as on Ubuntu until: https://github.com/ElmerCSC/elmerfem/blob/2766add0317ee0ccae8ef37e0559293ab8b44991/fem/src/MeshUtils.F90#L14034-L14035

After that, NiNj is all zeros.

Is it an issue with MSMPI?

mmuetzel commented 4 weeks ago

It looks like there is an issue with MPI_IN_PLACE when using MSMPI. (That might very well be an upstream issue.)

The following change fixes that test for me:

From 68bfc2c31ddc3f65e25c798602b402d869b1200b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Markus=20M=C3=BCtzel?= <markus.muetzel@gmx.de>
Date: Mon, 12 Aug 2024 15:08:15 +0200
Subject: [PATCH] Avoid `MPI_IN_PLACE` in MeshUtils.F90

Fixes #521.
---
 fem/src/MeshUtils.F90 | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fem/src/MeshUtils.F90 b/fem/src/MeshUtils.F90
index 499d7874b..6bae63427 100644
--- a/fem/src/MeshUtils.F90
+++ b/fem/src/MeshUtils.F90
@@ -13922,7 +13922,7 @@ CONTAINS
     REAL(KIND=dp) :: NiNj(9),A(3,3),F(3),M11,M12,M13,M14
     REAL(KIND=dp) :: d1,d2,MinDist,MaxDist,Dist,X0,Y0,Rad
     REAL(KIND=dp) :: Normal(3), AxisNormal(3), Tangent1(3), Tangent2(3), Coord(3), &
-        CircleCoord(9)
+        CircleCoord(9), buffer(9)
     INTEGER :: CircleInd(3) 
     LOGICAL :: BCMode, DoIt, GotNormal, GotCenter, GotRadius
     INTEGER :: Tag, t1, t2
@@ -14031,7 +14031,8 @@ CONTAINS
     ! Only in BC mode we do currently parallel reduction.
     ! This could be altered too. 
     IF( BCMode ) THEN
-      CALL MPI_ALLREDUCE(MPI_IN_PLACE,NiNj,9, &
+      buffer = NiNj;
+      CALL MPI_ALLREDUCE(buffer,NiNj,9, &
           MPI_DOUBLE_PRECISION,MPI_SUM,ELMER_COMM_WORLD,ierr)
     END IF

@@ -14131,7 +14132,8 @@ CONTAINS
     END DO

     IF( BCMode .AND. ParEnv % PEs > 1 ) THEN
-      CALL MPI_ALLREDUCE(MPI_IN_PLACE,CircleCoord,6, &
+      buffer = CircleCoord;
+      CALL MPI_ALLREDUCE(buffer,CircleCoord,6, &
           MPI_DOUBLE_PRECISION,MPI_MAX,ELMER_COMM_WORLD,ierr)
     END IF

@@ -14190,7 +14192,8 @@ CONTAINS
     END IF

     IF( BCMode .AND. ParEnv % PEs > 1 ) THEN
-      CALL MPI_ALLREDUCE(MPI_IN_PLACE,CircleCoord,9, &
+      buffer = CircleCoord;
+      CALL MPI_ALLREDUCE(buffer,CircleCoord,9, &
           MPI_DOUBLE_PRECISION,MPI_MAX,ELMER_COMM_WORLD,ierr)
     END IF

-- 
2.44.0.windows.1

That's probably not how this should be "fixed". Instead, it is likely that something isn't quite correct with the MSMPI headers (or import library) from MSYS2 (or MSMPI itself).

I wonder whether the value of MPI_IN_PLACE is imported correctly with gfortran given that there is the following code snippet in the mpif.h of MSMPI:

       COMMON /MPIPRIV1/ MPI_BOTTOM, MPI_IN_PLACE, MPI_STATUS_IGNORE

       COMMON /MPIPRIV2/ MPI_STATUSES_IGNORE, MPI_ERRCODES_IGNORE
!DEC$ ATTRIBUTES DLLIMPORT :: /MPIPRIV1/, /MPIPRIV2/

       COMMON /MPIFCMB5/ MPI_UNWEIGHTED
       COMMON /MPIFCMB9/ MPI_WEIGHTS_EMPTY
!DEC$ ATTRIBUTES DLLIMPORT :: /MPIFCMB5/, /MPIFCMB9/

       COMMON /MPIPRIVC/ MPI_ARGVS_NULL, MPI_ARGV_NULL
!DEC$ ATTRIBUTES DLLIMPORT :: /MPIPRIVC/

IIUC, !DEC$ attributes are ignored when building with gfortran. And setting attributes for common blocks doesn't work in gfortran: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47030

I haven't found a way around that.

Would it be acceptable to conditionally not use MPI_IN_PLACE and copy the input instead? Maybe, depending on a feature test result? But if the issue is caused by the missing DLLIMPORT attributes, there might also be issues with the other common block members (e.g., MPI_STATUSES_IGNORE).

Edit: Fixed copy-paste-error in patch.

mmuetzel commented 3 weeks ago

I think I have changes almost ready that use MPI_IN_PLACE only conditionally if a feature test during configuration succeeds.

However, MPI_IN_PLACE is also used in ElmerIce for which no CI is run currently. #522 adds building ElmerIce to the CI rules. I'd like to wait until after that is merged to be a bit more certain that the changes I'd like to propose don't cause regressions.