FEniCS / dolfinx

Next generation FEniCS problem solving environment
GNU Lesser General Public License v3.0

[BUG]: Conda test fails 0.8.0 #3179

Open jhale opened 2 months ago

jhale commented 2 months ago

Summarize the issue

The conda linux-64 build of 0.8.0 gives a test error:

FAILED unit/fem/test_fem_pipeline.py::test_dP_simplex[3-DG-tetrahedron] - AssertionError: assert 4.247720524033913e-06 < 1e-09
 +  where 4.247720524033913e-06 = <ufunc 'absolute'>(4.247720524033913e-06)
 +    where <ufunc 'absolute'> = np.abs
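
For scale, the check that fails has roughly the following shape. This is a minimal sketch using only the two numbers from the log above; the helper name `within_tolerance` is hypothetical, and the built-in `abs` stands in for the `np.abs` call in the actual test.

```python
# Values copied from the pytest failure message above.
observed_error = 4.247720524033913e-06
tolerance = 1e-09


def within_tolerance(error: float, tol: float) -> bool:
    """Return True when |error| is strictly below tol (hypothetical helper)."""
    return abs(error) < tol


passed = within_tolerance(observed_error, tolerance)
# How many times larger than the threshold the observed error is.
ratio = abs(observed_error) / tolerance

print(f"passed={passed}, error/tolerance={ratio:.0f}")
```

The error is more than three orders of magnitude above the threshold, so this does not look like a borderline rounding difference.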

Additionally, macOS builds are segfaulting at:


How to reproduce the bug

Unknown, conda build system.
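
The single failing parametrization can at least be targeted by its pytest node ID. A minimal sketch, assuming pytest and a DOLFINx install are available and that the working directory contains the `unit/` tree shown in the log; whether this reproduces the failure outside the conda build is not known. The function name `pytest_command` is hypothetical.

```python
import shlex
import sys

# Node ID copied verbatim from the FAILED line in the log.
NODE_ID = "unit/fem/test_fem_pipeline.py::test_dP_simplex[3-DG-tetrahedron]"


def pytest_command(node_id: str) -> list[str]:
    """Build an argv list that runs one pytest node with the current interpreter."""
    return [sys.executable, "-m", "pytest", "-v", node_id]


command = pytest_command(NODE_ID)
# shlex.join quotes the square brackets so the line can be pasted into a shell.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command)` from the test directory.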

Minimal Example (Python)

No response

Output (Python)

No response



DOLFINx git commit

No response


Conda build system x86-64 Linux.

Additional information

No response

minrk commented 2 months ago

The segfault is in dmumps_scatter_dist_rhs_. This is not the first time I've seen a problem in that routine.

* thread #1, name = 'main', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x5fff8db6eb50)
    frame #0: 0x000000010e50ed69 libdmumps.dylib`dmumps_scatter_dist_rhs_ + 1241
->  0x10e50ed69 <+1241>: incl   (%r9,%rdx,4)
    0x10e50ed6d <+1245>: incq   %rax
    0x10e50ed70 <+1248>: cmpq   %rax, %rsi
    0x10e50ed73 <+1251>: jne    0x10e50ed50               ; <+1216>
(lldb) bt
* thread #1, name = 'main', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x5fff8db6eb50)
  * frame #0: 0x000000010e50ed69 libdmumps.dylib`dmumps_scatter_dist_rhs_ + 1241
    frame #1: 0x000000010e507e69 libdmumps.dylib`dmumps_solve_driver_ + 72313
    frame #2: 0x000000010e5708ac libdmumps.dylib`dmumps_ + 3612
    frame #3: 0x000000010e5763f3 libdmumps.dylib`dmumps_f77_ + 7203
    frame #4: 0x000000010e56de19 libdmumps.dylib`dmumps_c + 3289
    frame #5: 0x000000010ba3b746 libpetsc.3.20.6.dylib`MatSolve_MUMPS + 742
    frame #6: 0x000000010bb7c481 libpetsc.3.20.6.dylib`MatSolve + 289
    frame #7: 0x000000010c1c7fa7 libpetsc.3.20.6.dylib`PCApply_LU + 87
    frame #8: 0x000000010c244954 libpetsc.3.20.6.dylib`PCApply + 212
    frame #9: 0x000000010c039be4 libpetsc.3.20.6.dylib`KSPSolve_PREONLY + 308
    frame #10: 0x000000010c09037e libpetsc.3.20.6.dylib`KSPSolve_Private + 1374
    frame #11: 0x000000010c08fdce libpetsc.3.20.6.dylib`KSPSolve + 30
    frame #12: 0x0000000177cc79c8 libslepc.3.20.2.dylib`STMatSolve + 120
    frame #13: 0x0000000177cc8977 libslepc.3.20.2.dylib`STApply_Generic + 87
    frame #14: 0x0000000177cc9863 libslepc.3.20.2.dylib`MatMult_STOperator + 275
    frame #15: 0x000000010b80d937 libpetsc.3.20.6.dylib`MatMult_Shell + 423
    frame #16: 0x000000010bb7084b libpetsc.3.20.6.dylib`MatMult + 235
    frame #17: 0x0000000177cc8a90 libslepc.3.20.2.dylib`STApply + 64
    frame #18: 0x0000000177e0ecbb libslepc.3.20.2.dylib`EPSGetStartVector + 219
    frame #19: 0x0000000177dded45 libslepc.3.20.2.dylib`EPSSolve_KrylovSchur_Default + 229
    frame #20: 0x0000000177e0bfd5 libslepc.3.20.2.dylib`EPSSolve + 517
    frame #21: 0x000000017a6165a8 SLEPc.cpython-310-darwin.so`__pyx_pw_8slepc4py_5SLEPc_3EPS_123solve + 40

minrk commented 2 months ago

The curl-curl segfault is a MUMPS bug, already reported here: https://github.com/conda-forge/mumps-feedstock/issues/110

RemDelaporteMathurin commented 2 months ago

When updating to 0.8.0, I started noticing errors in our CI.

I also noticed them in the Docker CI, though, so I'm not sure they're related to this.