awslabs / palace

3D finite element solver for computational electromagnetics
https://awslabs.github.io/palace/dev
Apache License 2.0

Nedelec elements fail to converge for OpenMP threads greater than 1 #279

Closed hughcars closed 3 months ago

hughcars commented 3 months ago

The rings example, when run with -nt 2 or higher, fails in the first linear solve, diverging at 100 iterations with a KSP norm of O(1e1). With -nt 1 it converges in 47 iterations. Running with MGMaxLevels results in 75 KSP iterations for both -nt 1 and -nt 6.

The spheres example solves in 9 iterations for both -nt 6 and -nt 1, suggesting the issue is restricted to Nedelec elements.

The cavity example on the tet mesh takes 9 iterations with -nt 1 and 10 with -nt 6; on the hex mesh it takes 5 iterations in both cases. So the problem does not always show up, but the slight increase in iteration count on the tet mesh suggests it may simply not grow fast enough to matter on this simpler problem.

The cpw_wave_uniform example takes 23 KSP iterations with -nt 1, but with -nt 6 it doesn't converge in 200 iterations. cpw_lumped_uniform doesn't converge with -nt 6 and multigrid, but does converge without multigrid.
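To keep the observations above in one place, here is a small Python sketch that tabulates the reported KSP iteration counts and labels each case by how it responds to threading. The numbers are copied from this report; the `classify` helper and its labels are only illustrative, and a rings failure at -nt 6 is inferred from the "-nt 2 or higher" observation. cpw_lumped_uniform is left out because no iteration counts were recorded for it.

```python
# KSP iteration counts as reported above, keyed by OpenMP thread count (-nt).
# None means the linear solve did not converge.
reported = {
    "rings": {1: 47, 6: None},
    "rings (MGMaxLevels set)": {1: 75, 6: 75},
    "spheres": {1: 9, 6: 9},
    "cavity (tet)": {1: 9, 6: 10},
    "cavity (hex)": {1: 5, 6: 5},
    "cpw_wave_uniform": {1: 23, 6: None},
}


def classify(counts, serial=1):
    """Label a case by comparing threaded runs against the serial run."""
    base = counts[serial]
    threaded = [v for nt, v in counts.items() if nt != serial]
    if any(v is None for v in threaded):
        return "fails threaded"
    if any(v != base for v in threaded):
        return "iteration count changes"
    return "unaffected"


summary = {name: classify(counts) for name, counts in reported.items()}

for name, status in summary.items():
    print(f"{name:26s} {status}")
```

Grouping the cases this way makes it easier to see which configurations are sensitive to the thread count and which are not.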

This suggests the issue is:

hughcars commented 3 months ago

The above was tested using -framework Accelerate on a Mac M1. After rebuilding from scratch with armpl instead, the problem appears to be resolved: the number of iterations for rings increases slightly with 6 threads but the solve does converge, and cpw_lumped converges in exactly 23 iterations. Closing this, as the issue appears to be related to incorrectly configuring the BLAS rather than to Palace.
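For anyone hitting similar behaviour, one possible first check is to confirm which BLAS a binary is actually linked against. The sketch below scans a linked-library listing (for example the text printed by `otool -L` on macOS) for a few provider names. The `detect_blas` helper, the hint table, and the sample listing are all hypothetical and not taken from a real Palace build; a statically linked BLAS would not show up in such a listing at all.

```python
# Substrings that hint at a BLAS/LAPACK provider in a linked-library listing.
# Illustrative only, not an exhaustive list.
BLAS_HINTS = {
    "accelerate.framework": "Apple Accelerate",
    "armpl": "Arm Performance Libraries",
    "openblas": "OpenBLAS",
}


def detect_blas(listing):
    """Return the set of BLAS providers whose names appear in the listing."""
    found = set()
    for line in listing.splitlines():
        lowered = line.lower()
        for needle, provider in BLAS_HINTS.items():
            if needle in lowered:
                found.add(provider)
    return found


# Hypothetical listing in the style of `otool -L <binary>`; not real output.
sample = (
    "./solver_binary:\n"
    "\t/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate\n"
    "\t/usr/lib/libSystem.B.dylib\n"
)

print(detect_blas(sample))
```

If more than one provider appears, or not the one that was intended at configure time, that is a hint that the build picked up a different BLAS than expected.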