Open MatthiasHeilManchester opened 8 months ago
Hi @MatthiasHeilManchester, can you please copy the AMG setup parameters for both versions, as you did for the solve phase? Something like the following. Thanks.
BoomerAMG SETUP PARAMETERS:
Max levels = 25
Num levels = 5
Strength Threshold = 0.250000
Interpolation Truncation Factor = 0.000000
Maximum Row Sum Threshold for Dependency Weakening = 1.000000
Coarsening Type = HMIS
measures are determined locally
No global partition option chosen.
Interpolation = extended+i interpolation
Operator Matrix Information:
             nonzero           entries/row            row sums
lev    rows  entries sparse   min  max     avg         min         max
======================================================================
  0    1000     6400  0.006     4    7     6.4   0.000e+00   3.000e+00
  1     500     7248  0.029     7   17    14.5   0.000e+00   4.000e+00
  2      99     2999  0.306    15   43    30.3   1.041e-02   5.319e+00
  3      14      188  0.959    11   14    13.4   5.274e+00   1.007e+01
  4       4       16  1.000     4    4     4.0   7.597e+00   9.192e+00
Interpolation Matrix Information:
                   entries/row           min         max        row sums
lev  rows x cols  min  max  avgW      weight      weight         min         max
================================================================================
  0  1000 x 500      1    4   4.0   1.667e-01   2.500e-01   5.000e-01   1.000e+00
  1   500 x 99       1    4   4.0   1.301e-02   3.547e-01   2.164e-01   1.000e+00
  2    99 x 14       1    4   4.0   1.247e-03   3.929e-01   2.865e-02   1.000e+00
  3    14 x 4        1    4   3.6  -6.321e-02   6.629e-02  -6.118e-02   1.000e+00
@liruipeng : Sure (and sorry about the delay): here's the full setup/solve information for a representative run. I'd omitted the explicit listing of the output from the setup phase in the original post because there doesn't seem to be much difference, but maybe I've overlooked something. This is from one of the runs where using the newer version of hypre causes an increase in the (outer) iteration count by a modest amount. This is fairly typical (and would probably have flown under the radar). However, there are a couple of tests where the increase is so dramatic that the outermost Newton solver fails to converge (the outer linear solver bails out after 100 iterations). In that case we're using Falgout-CLJP as the Coarsening Type. Again I can't see much difference but can send the full output for that too if it rings a bell.
version 2.30.0:
Num MPI tasks = 2
Num OpenMP threads = 1
BoomerAMG SETUP PARAMETERS:
Max levels = 100
Num levels = 5
Strength Threshold = 0.250000
Interpolation Truncation Factor = 0.000000
Maximum Row Sum Threshold for Dependency Weakening = 1.000000
Coarsening Type = Cleary-Luby-Jones-Plassman
No global partition option chosen.
Interpolation = modified classical interpolation
Operator Matrix Information:
             nonzero           entries/row            row sums
lev    rows  entries sparse   min  max     avg         min         max
======================================================================
  0     299     7033  0.079    11   27    23.5  -1.042e+01   1.250e+01
  1     138     5154  0.271    12   61    37.3  -8.724e+00   1.285e+01
  2      42      980  0.556    10   40    23.3  -1.175e+00   1.024e+01
  3      19      307  0.850    12   19    16.2   6.970e-01   8.513e+00
  4       7       49  1.000     7    7     7.0   2.166e+00   7.709e+00
Interpolation Matrix Information:
                   entries/row           min         max        row sums
lev  rows x cols  min  max  avgW      weight      weight         min         max
================================================================================
  0   299 x 138      1    6   3.2   3.125e-02   4.093e-01   2.308e-01   1.608e+00
  1   138 x 42       1    7   2.0   2.135e-02   4.676e-01   1.122e-01   1.064e+00
  2    42 x 19       1    4   2.3   2.249e-02   6.693e-01   1.009e-01   1.276e+00
  3    19 x 7        1    2   1.6   4.752e-02   5.604e-01   1.328e-01   1.000e+00
Complexity: grid = 1.688963
operator = 1.922793
memory = 2.062562
BoomerAMG SOLVER PARAMETERS:
Maximum number of cycles: 1
Stopping Tolerance: 0.000000e+00
Cycle type (1 = V, 2 = W, etc.): 1
Relaxation Parameters:
Visiting Grid: down up coarse
Number of sweeps: 2 2 1
Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 1 1 9
Point types, partial sweeps (1=C, -1=F):
Pre-CG relaxation (down): 0 0
Post-CG relaxation (up): 0 0
Coarsest grid: 0
version 2.0.0:
BoomerAMG SETUP PARAMETERS:
Max levels = 100
Num levels = 5
Strength Threshold = 0.250000
Interpolation Truncation Factor = 0.000000
Maximum Row Sum Threshold for Dependency Weakening = 1.000000
Coarsening Type = Cleary-Luby-Jones-Plassman
Interpolation = modified classical interpolation
Operator Matrix Information:
             nonzero         entries per row          row sums
lev    rows  entries sparse   min  max     avg         min         max
===================================================================
  0     299     7033  0.079    11   27    23.5  -1.042e+01   1.250e+01
  1     138     5154  0.271    12   61    37.3  -8.724e+00   1.285e+01
  2      42      980  0.556    10   40    23.3  -1.175e+00   1.024e+01
  3      19      307  0.850    12   19    16.2   6.970e-01   8.513e+00
  4       7       49  1.000     7    7     7.0   2.166e+00   7.709e+00
Interpolation Matrix Information:
                entries/row        min         max        row sums
lev  rows   cols  min  max      weight      weight         min         max
=================================================================
  0   299 x 138      1    6   3.125e-02   4.093e-01   2.308e-01   1.608e+00
  1   138 x 42       1    7   2.135e-02   4.676e-01   1.122e-01   1.064e+00
  2    42 x 19       1    4   2.249e-02   6.693e-01   1.009e-01   1.276e+00
  3    19 x 7        1    2   4.752e-02   5.604e-01   1.328e-01   1.000e+00
Complexity: grid = 1.688963
operator = 1.922793
BoomerAMG SOLVER PARAMETERS:
Maximum number of cycles: 1
Stopping Tolerance: 0.000000e+00
Cycle type (1 = V, 2 = W, etc.): 1
Relaxation Parameters:
Visiting Grid: down up coarse
Number of partial sweeps: 2 2 1
Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 1 1 9
Point types, partial sweeps (1=C, -1=F):
Pre-CG relaxation (down): 1 -1 1 -1
Post-CG relaxation (up): -1 1 -1 1
Coarsest grid: 0
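The "Complexity" lines can be cross-checked against the Operator Matrix Information table. A minimal sketch in Python, assuming the usual AMG definitions (grid complexity = total rows over all levels divided by rows on level 0; operator complexity = the same ratio for nonzero entries). The function name `complexities` is hypothetical, and the data is copied from the CLJP table above, which is identical in both versions' output:

```python
# (rows, nonzero entries) per level, copied from the CLJP operator table above.
CLJP_LEVELS = [(299, 7033), (138, 5154), (42, 980), (19, 307), (7, 49)]


def complexities(levels):
    """Return (grid, operator) complexity for a list of (rows, entries) pairs.

    Assumes the standard definitions: sums over all levels divided by the
    corresponding fine-grid (level 0) value.
    """
    rows0, nnz0 = levels[0]
    grid = sum(rows for rows, _ in levels) / rows0
    operator = sum(nnz for _, nnz in levels) / nnz0
    return grid, operator


grid, operator = complexities(CLJP_LEVELS)
print(f"grid = {grid:.6f}, operator = {operator:.6f}")
```

For this run the result agrees with the reported grid and operator complexities to the printed precision, which is consistent with the two versions building the same hierarchy in the CLJP case.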
@liruipeng : ...actually just for completeness, here's the corresponding output from the more dramatic case. With the old version of hypre, the Newton solver takes three steps, and for each solve the outer GMRES iteration converges in about 20 iterations. With the new version, GMRES bails out after 100 iterations and the final approximation to the solution is so poor that the Newton method fails to converge within 10 iterations, at which point the code bails completely.
Anyway, here are the stats:
version 2.30.0:
BoomerAMG SETUP PARAMETERS:
Max levels = 100
Num levels = 7
Strength Threshold = 0.250000
Interpolation Truncation Factor = 0.000000
Maximum Row Sum Threshold for Dependency Weakening = 1.000000
Coarsening Type = Falgout-CLJP
measures are determined locally
No global partition option chosen.
Interpolation = modified classical interpolation
Operator Matrix Information:
             nonzero           entries/row            row sums
lev    rows  entries sparse   min  max     avg         min         max
======================================================================
  0     720    10428  0.020     6   25    14.5  -2.991e-01   2.393e+00
  1     463    10201  0.048     5   39    22.0  -2.829e-15   8.831e-01
  2     176     4292  0.139     6   37    24.4  -1.954e-15   1.228e+00
  3      68     1244  0.269     5   24    18.3  -7.041e-04   1.068e+00
  4      28      370  0.472     9   18    13.2  -4.570e-09   8.076e-01
  5      13      109  0.645     7   13     8.4  -5.551e-16   9.961e-01
  6       5       21  0.840     3    5     4.2   1.466e-06   8.137e-01
Interpolation Matrix Information:
                   entries/row           min         max        row sums
lev  rows x cols  min  max  avgW      weight      weight         min         max
================================================================================
  0   720 x 463      1    6   4.1   1.059e-01   5.000e-01   4.262e-01   1.000e+00
  1   463 x 176      1    8   2.9   5.556e-02   1.000e+00   6.288e-01   1.000e+00
  2   176 x 68       1    5   2.3   9.017e-02   1.000e+00   4.337e-01   1.000e+00
  3    68 x 28       1    4   2.6   4.887e-02   1.000e+00   1.635e-01   1.000e+00
  4    28 x 13       1    4   2.5   1.326e-01   1.000e+00   6.872e-01   1.000e+00
  5    13 x 5        1    2   1.5   2.910e-01   1.000e+00   4.055e-01   1.000e+00
Complexity: grid = 2.045833
operator = 2.557058
memory = 2.846663
BoomerAMG SOLVER PARAMETERS:
Maximum number of cycles: 1
Stopping Tolerance: 0.000000e+00
Cycle type (1 = V, 2 = W, etc.): 1
Relaxation Parameters:
Visiting Grid: down up coarse
Number of sweeps: 2 2 1
Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 0 0 9
Point types, partial sweeps (1=C, -1=F):
Pre-CG relaxation (down): 0 0
Post-CG relaxation (up): 0 0
Coarsest grid: 0
version 2.0.0:
BoomerAMG SETUP PARAMETERS:
Max levels = 100
Num levels = 7
Strength Threshold = 0.250000
Interpolation Truncation Factor = 0.000000
Maximum Row Sum Threshold for Dependency Weakening = 1.000000
Coarsening Type = Falgout-CLJP
measures are determined locally
Interpolation = modified classical interpolation
Operator Matrix Information:
             nonzero         entries per row          row sums
lev    rows  entries sparse   min  max     avg         min         max
===================================================================
  0     720    10428  0.020     6   25    14.5  -2.991e-01   2.393e+00
  1     501    10915  0.043     6   39    21.8  -1.583e-01   2.424e+00
  2     207     4261  0.099     6   35    20.6  -1.510e-02   1.179e+00
  3      91     1757  0.212     7   27    19.3  -5.146e-04   1.187e+00
  4      38      534  0.370     8   21    14.1  -3.053e-15   1.287e+00
  5      11       81  0.669     5   10     7.4  -2.056e-15   9.110e-01
  6       4       16  1.000     4    4     4.0   3.273e-05   6.876e-01
Interpolation Matrix Information:
                entries/row        min         max        row sums
lev  rows   cols  min  max      weight      weight         min         max
=================================================================
  0   720 x 501      1    6   1.059e-01   5.000e-01   4.262e-01   1.000e+00
  1   501 x 207      1    7   6.442e-02   1.000e+00   4.242e-01   1.000e+00
  2   207 x 91       1    5   9.071e-02   1.004e+00   4.747e-01   1.021e+00
  3    91 x 38       1    5   1.013e-01   1.000e+00   3.953e-01   1.000e+00
  4    38 x 11       1    4   9.807e-02   1.000e+00   3.773e-01   1.000e+00
  5    11 x 4        1    2   3.416e-01   1.000e+00   5.490e-01   1.000e+00
Complexity: grid = 2.183333
operator = 2.684311
BoomerAMG SOLVER PARAMETERS:
Maximum number of cycles: 1
Stopping Tolerance: 0.000000e+00
Cycle type (1 = V, 2 = W, etc.): 1
Relaxation Parameters:
Visiting Grid: down up coarse
Number of partial sweeps: 2 2 1
Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 0 0 9
Point types, partial sweeps (1=C, -1=F):
Pre-CG relaxation (down): 1 -1 1 -1
Post-CG relaxation (up): -1 1 -1 1
Coarsest grid: 0
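To check whether the two setup phases actually build the same hierarchy, the operator tables can be compared level by level rather than by eye. The sketch below (Python; the helper names `parse_operator_rows` and `first_divergence` are hypothetical) parses the Falgout-CLJP operator rows quoted above and reports the first level at which the row count or the number of nonzero entries differs. It is only a diagnostic aid and does not by itself identify the cause.

```python
# Operator Matrix Information data rows, copied from the Falgout-CLJP runs above.
NEW_OPERATOR_ROWS = """
0 720 10428 0.020 6 25 14.5 -2.991e-01 2.393e+00
1 463 10201 0.048 5 39 22.0 -2.829e-15 8.831e-01
2 176 4292 0.139 6 37 24.4 -1.954e-15 1.228e+00
3 68 1244 0.269 5 24 18.3 -7.041e-04 1.068e+00
4 28 370 0.472 9 18 13.2 -4.570e-09 8.076e-01
5 13 109 0.645 7 13 8.4 -5.551e-16 9.961e-01
6 5 21 0.840 3 5 4.2 1.466e-06 8.137e-01
"""

OLD_OPERATOR_ROWS = """
0 720 10428 0.020 6 25 14.5 -2.991e-01 2.393e+00
1 501 10915 0.043 6 39 21.8 -1.583e-01 2.424e+00
2 207 4261 0.099 6 35 20.6 -1.510e-02 1.179e+00
3 91 1757 0.212 7 27 19.3 -5.146e-04 1.187e+00
4 38 534 0.370 8 21 14.1 -3.053e-15 1.287e+00
5 11 81 0.669 5 10 7.4 -2.056e-15 9.110e-01
6 4 16 1.000 4 4 4.0 3.273e-05 6.876e-01
"""


def parse_operator_rows(text):
    """Map level -> (rows, nonzero entries) for each data row in the text."""
    levels = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Data rows have nine fields and start with the level number;
        # header and separator lines are skipped.
        if len(fields) == 9 and fields[0].isdigit():
            levels[int(fields[0])] = (int(fields[1]), int(fields[2]))
    return levels


def first_divergence(a, b):
    """Return the lowest level whose (rows, entries) differ, or None."""
    for lev in sorted(set(a) | set(b)):
        if a.get(lev) != b.get(lev):
            return lev
    return None


new_levels = parse_operator_rows(NEW_OPERATOR_ROWS)
old_levels = parse_operator_rows(OLD_OPERATOR_ROWS)
print("first differing level:", first_divergence(new_levels, old_levels))
```

For this data the fine-grid operator is identical in both versions but the coarse grids differ from level 1 onward (463 versus 501 rows), whereas the CLJP tables in the earlier run match on every level.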
We've recently upgraded from Hypre v2.0.0 to v2.30.0. We use BoomerAMG (with a fixed number of cycles) as a preconditioner and noticed that, following the upgrade, the iteration counts in the (outer) iterative solver increased by more than the tolerances in our self-tests accept (and sometimes catastrophically so).
The difference only occurs when the code is run in parallel; when run serially (or with mpirun -np 1) the behaviour is essentially the same as in v2.0.0.
We only had to make two straightforward changes to the code (listed at the end) to make it compile with v2.30.0. Apart from those changes, our own code remains completely unchanged, so the behaviour appears to be entirely due to the different version of hypre that we're linking against.
We have looked through the Hypre Changelog but couldn't find anything that would explain the changed behaviour. Any hints would be appreciated and we're obviously happy to provide further diagnostics (this is all done on ubuntu using openmpi).
Diagnostics:
I have called
to set the print level to 3 (= as verbose as possible).
Comparing the output from v2.0.0 and 2.30.0 (running on two processors using mpi, and one OpenMP thread) then shows the following differences:
- Version 2.30.0 outputs "No global partition option chosen" which, based on the code in parcsr_ls/par_stats.c, is output all the time and is probably a relic of the removal of an associated option in version 2.21.0 (see Hypre Changelog).
- Version 2.30.0 outputs "Interpolation = extended+i interpolation" whereas version 2.0.0 says "Interpolation = modified classical interpolation". This can be reset by calling HYPRE_BoomerAMGSetOldDefault(...). Doing this made no real difference.
- When setting the print level to 3, the solver parameters are output as follows:
version 2.30.0:
version 2.0.0:
Possibly just a change in the way the Pre/Post-CG relaxation is documented (I don't fully understand the meaning of this output, so can't judge if this is just cosmetic or an indication of an actual change to what the code does).
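One possible reading of the "Point types" lines, sketched in Python under stated assumptions. The output header itself gives 1 = C and -1 = F; that 0 means "all points" is an assumption here and is not confirmed anywhere in this thread. The names `POINT_TYPES` and `describe_sweeps` are hypothetical, and the lists are copied from the Pre-CG (down) lines of the solver output posted in the comments.

```python
# 1 and -1 are taken from the log header "(1=C, -1=F)".
# The meaning of 0 is an ASSUMPTION, not something stated in the output.
POINT_TYPES = {
    1: "C-points",
    -1: "F-points",
    0: "all points (assumed)",
}


def describe_sweeps(point_types):
    """Translate a list of point-type codes into readable labels."""
    return [POINT_TYPES[code] for code in point_types]


# Pre-CG relaxation (down), as printed by each version.
old_down = [1, -1, 1, -1]   # v2.0.0
new_down = [0, 0]           # newer version

print("v2.0.0 :", describe_sweeps(old_down))
print("newer  :", describe_sweeps(new_down))
```

If that assumption holds, v2.0.0 relaxes C-points and F-points in separate partial sweeps while the newer version sweeps over all points at once, which would be a change in what the smoother does rather than a purely cosmetic change in the printout. This would need to be confirmed against the hypre documentation or by the developers.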
Interface changes:
To compile with the new version of Hypre we had to make two changes:
and
Everything else stays the same.