festim-dev / FESTIM

Coupled hydrogen/tritium transport and heat transfer modelling using FEniCS
https://festim.readthedocs.io/en/stable/
Apache License 2.0

Update dolfinx 0.8 #764

Open · RemDelaporteMathurin opened 2 months ago

RemDelaporteMathurin commented 2 months ago

This PR just updates the version of dolfinx to 0.8.0

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 98.43%. Comparing base (0b02ffa) to head (9166f30).

:exclamation: Current head 9166f30 differs from pull request most recent head 02162a7. Consider uploading reports for the commit 02162a7 to get more accurate results

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           fenicsx     #764      +/-   ##
===========================================
+ Coverage    98.37%   98.43%   +0.06%     
===========================================
  Files           28       28              
  Lines         1540     1537       -3     
===========================================
- Hits          1515     1513       -2     
+ Misses          25       24       -1     
```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

RemDelaporteMathurin commented 2 months ago

There is a random error occurring on both the conda and Docker CI jobs when running test_xdmf.py. Oddly, all the tests in that file seem to pass, so I don't really know what is going on here.

```
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
```

RemDelaporteMathurin commented 2 months ago

The error that we are seeing here might be related to https://github.com/FEniCS/dolfinx/issues/3162

We'll wait for a fix before merging this

jhdark commented 2 months ago

0.8.1 has now been released; not sure if that helps with this bug?
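For anyone reproducing this, it is worth double-checking which dolfinx and PETSc versions the failing environment is actually running before re-testing. A minimal check, using only standard dolfinx/petsc4py attributes (nothing FESTIM-specific):

```python
# Print the versions the failing tests are actually running against.
import dolfinx
from petsc4py import PETSc

print("dolfinx:", dolfinx.__version__)                       # e.g. 0.8.0 vs 0.8.1
print("PETSc:", ".".join(map(str, PETSc.Sys.getVersion())))  # (major, minor, subminor)
```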

RemDelaporteMathurin commented 2 months ago

I've narrowed the error down to these lines:

```python
my_model.solver.convergence_criterion = "incremental"
ksp = my_model.solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "cg"
opts[f"{option_prefix}pc_type"] = "gamg"
opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()
```

Running just test_multispecies_problem gives:

```
Solving H transport problem: 100%|█████████████████████████████████████████████████████████████████████████| 10.0/10.0 [00:02<00:00, 4.74it/s]
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There is one unused database option. It is:
Option left: name:-nls_solve_pc_factor_mat_solver_type value: mumps source: code
```

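The leftover option is expected rather than a bug in itself: `pc_factor_mat_solver_type` is only read by factorization preconditioners such as `lu` or `cholesky`, so with `pc_type = "gamg"` PETSc never queries it and reports it as unused. For reference, a hedged sketch of a configuration in which that option would actually be consumed, continuing from the snippet above (so `my_model` is assumed to be an already-initialised FESTIM problem):

```python
from petsc4py import PETSc

# same access path as above: the Krylov solver attached to the Newton solver
ksp = my_model.solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()

# with a direct factorization preconditioner, mat_solver_type selects the
# factorization backend (here MUMPS), so the option is no longer left unused
opts[f"{option_prefix}ksp_type"] = "preonly"
opts[f"{option_prefix}pc_type"] = "lu"
opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()
```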
RemDelaporteMathurin commented 2 months ago

MWE to reproduce the random segmentation fault

```python
import numpy as np
import festim as F
from petsc4py import PETSc

# running the same problem twice in a row is enough to trigger the crash intermittently
for i in range(2):

    my_model = F.HydrogenTransportProblem()
    my_model.mesh = F.Mesh1D(np.linspace(0, 1, num=1000))

    my_mat = F.Material(D_0=1.9e-7, E_D=0.2, name="my_mat")
    my_subdomain = F.VolumeSubdomain1D(id=1, borders=[0, 1], material=my_mat)
    my_model.subdomains = [my_subdomain]

    my_model.species = [F.Species("H")]

    my_model.temperature = 500

    my_model.settings = F.Settings(atol=1e10, rtol=1e-10, transient=False)

    my_model.initialise()

    # customise the Newton solver's Krylov solver via PETSc options
    my_model.solver.convergence_criterion = "incremental"
    ksp = my_model.solver.krylov_solver
    opts = PETSc.Options()
    option_prefix = ksp.getOptionsPrefix()
    opts[f"{option_prefix}ksp_type"] = "cg"
    opts[f"{option_prefix}pc_type"] = "gamg"
    opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
    ksp.setFromOptions()

    my_model.run()
```

This randomly produces:

```
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
```

RemDelaporteMathurin commented 2 months ago

I noticed that removing the lines

```python
my_model.solver.convergence_criterion = "incremental"
ksp = my_model.solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "cg"
opts[f"{option_prefix}pc_type"] = "gamg"
opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()
```

removes the random segfault.
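If the crash turns out to be a destruction-order problem between Python's garbage collector and PETSc (a common cause of SEGV at interpreter shutdown or between successive solves), a possible mitigation is to flush PETSc's pending destructions between runs. This is only a sketch under that assumption, not a confirmed fix for this issue:

```python
import gc
from petsc4py import PETSc

# ... after my_model.run() in each loop iteration of the MWE above:
gc.collect()              # release Python references to solver objects
PETSc.garbage_cleanup()   # destroy PETSc objects queued for collective destruction
```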