FEniCS / dolfinx

Next generation FEniCS problem solving environment
https://fenicsproject.org
GNU Lesser General Public License v3.0

[BUG]: complex demo_axis, demo_pml sometimes fails #2686

Open drew-parsons opened 1 year ago

drew-parsons commented 1 year ago

How to reproduce the bug

The complex-number Python demo in DOLFINx 0.6, demo_axis.py, often (though not always) fails on less common architectures such as armel, armhf and s390x.

The failure is a larger-than-expected error in the computed complex-valued quantities: the scattering error is 16.9% instead of 0.36%, and the extinction error is 1.65% instead of 0.40%. A sample test failure from s390x is copied below.

The demo does not always fail: armel, for example, apparently passed with glibc 2.37.
Other less common architectures, such as ppc64el, also pass.

Could there be assumptions in the complex number code that don't hold on all architectures? For instance, there is some Red Hat discussion of gcc complex number handling at https://bugzilla.redhat.com/show_bug.cgi?id=1918519

Minimal Example (Python)

demo_axis.py

Output (Python)

1496s =================================== FAILURES ===================================
1496s _____________________ test_demos_mpi[path15-demo_axis.py] ______________________
...
1496s ----------------------------- Captured stdout call -----------------------------
...
1496s Info    : Done meshing 2D (Wall 0.178272s, CPU 0.179425s)
1496s Info    : 2391 nodes 4950 elements
1496s 
1496s The analytical absorption efficiency is 0.9622728008329892
1496s The numerical absorption efficiency is 0.9583126467646697
1496s The error is 0.4115417233960486%
1496s 
1496s The analytical scattering efficiency is 0.07770397394691526
1496s The numerical scattering efficiency is 0.0644973472397121
1496s The error is 16.996076309077182%
1496s 
1496s The analytical extinction efficiency is 1.0399767747799045
1496s The numerical extinction efficiency is 1.0228099940043818
1496s The error is 1.6506888607349703%
1496s ----------------------------- Captured stderr call -----------------------------
1496s Traceback (most recent call last):
1496s   File "/tmp/autopkgtest-lxc.tupamcyj/downtmp/build.4cF/src/python/demo/demo_axis/demo_axis.py", line 696, in <module>
1496s     assert err_sca < 0.01
1496s            ^^^^^^^^^^^^^^
1496s AssertionError
1496s --------------------------------------------------------------------------
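
For reference, the percentages in the log above are consistent with a plain relative error, |analytical - numerical| / |analytical|, compared against the 0.01 threshold in the failing assertion. The following is a minimal sketch of that check using the values copied from the s390x log. The helper name `relative_error` and the `results` dictionary are hypothetical, and applying the same 0.01 tolerance to all three quantities is an assumption; only `assert err_sca < 0.01` is visible in the traceback.

```python
# Sketch only: reproduces the error percentages printed in the s390x log.
# `relative_error` is a hypothetical helper, not a function from the demo.

def relative_error(analytical: float, numerical: float) -> float:
    """Return |analytical - numerical| / |analytical| as a fraction."""
    return abs(analytical - numerical) / abs(analytical)


# (analytical, numerical) efficiencies copied from the log above.
results = {
    "absorption": (0.9622728008329892, 0.9583126467646697),
    "scattering": (0.07770397394691526, 0.0644973472397121),
    "extinction": (1.0399767747799045, 1.0228099940043818),
}

# Threshold taken from `assert err_sca < 0.01`; using it for the other
# two quantities is an assumption made for illustration.
TOL = 0.01

errors = {name: relative_error(a, n) for name, (a, n) in results.items()}

for name, err in errors.items():
    status = "ok" if err < TOL else "exceeds tolerance"
    print(f"{name}: {100 * err:.4f}% ({status})")
```

With these numbers only the absorption error stays under 1%; the scattering error is the one that trips the assertion shown in the traceback.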

Version

0.6.0

DOLFINx git commit

debian build 0.6.0-2

Installation

Debian CI test run using Debian builds from the official Debian buildds (armel, armhf, s390x)

Additional information

garth-wells commented 1 year ago

See #2556 for an issue with this demo, which might or might not be related.

drew-parsons commented 1 year ago

"less common architectures"... got it failing on i386 now https://ci.debian.net/data/autopkgtest/testing/i386/f/fenics-dolfinx/35412547/log.gz

IgorBaratta commented 1 year ago

I'm working on it. The demo is actually over-complicated and requires some simplifications. But I'll also review the tolerances used.

drew-parsons commented 1 year ago

As Garth indicated, demo_pml.py has similar issues. After rebuilding with gcc-13, I'm now getting the same kind of error in demo_pml.py (on amd64 this time):

----------------------------- Captured stdout call -----------------------------
['mpiexec', '-np', '3', '/usr/bin/python3', 'demo_pml.py']
Info    : Meshing 1D...
Info    : [  0%] Meshing curve 1 (Circle)
...
Info    : [100%] Meshing curve 31 (Line)
Info    : Done meshing 1D (Wall 0.00763787s, CPU 0.00764s)
Info    : Meshing 2D...
Info    : [  0%] Meshing surface 1 (Plane, Frontal-Delaunay)
...
Info    : [100%] Meshing surface 12 (Plane, Frontal-Delaunay)
Info    : Done meshing 2D (Wall 0.847349s, CPU 0.847278s)
Info    : 7767 nodes 16095 elements
Cannot write Esh.bp: VTXWriter (adios2) is not available
Cannot write Esh.bp: VTXWriter (adios2) is not available
Cannot write Esh.bp: VTXWriter (adios2) is not available
Cannot write E.bp: VTXWriter (adios2) is not available
Cannot write E.bp: VTXWriter (adios2) is not available
Cannot write E.bp: VTXWriter (adios2) is not available

The analytical absorption efficiency is 0.9089500187622276
The numerical absorption efficiency is 0.9075812316991292
The error is 0.15058991526974808%

The analytical scattering efficiency is 0.8018061316558375
The numerical scattering efficiency is 0.7911945506164945
The error is 1.3234597018394794%

The analytical extinction efficiency is 1.710756150418065
The numerical extinction efficiency is 1.6987757823156238
The error is 0.7002966553423526%
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
  File "/tmp/autopkgtest.90aLKs/tree/python/demo/demo_pml/demo_pml.py", line 620, in <module>
Traceback (most recent call last):
  File "/tmp/autopkgtest.90aLKs/tree/python/demo/demo_pml/demo_pml.py", line 620, in <module>
Traceback (most recent call last):
  File "/tmp/autopkgtest.90aLKs/tree/python/demo/demo_pml/demo_pml.py", line 620, in <module>
    assert err_sca < 0.01
    assert err_sca < 0.01
           ^^^^^^^^^^^^^^
AssertionError
    assert err_sca < 0.01
           ^^^^^^^^^^^^^^
AssertionError
           ^^^^^^^^^^^^^^
AssertionError

garth-wells commented 1 year ago

@IgorBaratta any update on simplifying the demos?

IgorBaratta commented 1 year ago

I have looked only at demo_axis, but I'm thinking that similar enhancements might apply to demo_pml as well. I plan to address these issues and submit a pull request in the next few days.

mscroggs commented 1 week ago

@IgorBaratta is this still an issue?

IgorBaratta commented 1 week ago

I haven't reviewed this in a while, but I have a branch where I started addressing some of the issues. I'll get back to you soon.