libMesh / libmesh

libMesh github repository
http://libmesh.github.io
GNU Lesser General Public License v2.1

ExodusII_IO_Helper: do not crash when outputting subdomains without elements #3949

Open jmeier opened 2 months ago

jmeier commented 2 months ago

Dear libmesh community,

This issue requests making write_element_data() (and its sister methods) more resilient to missing subdomains.

As you know, Moose relies heavily on libmesh, including for writing Exodus files. In addition, the so-called [MeshModifiers] have recently been extended to allow elements to be moved from one subdomain/block to another. From my point of view, this is a very important and highly welcome feature. It also allows models to be handled on the basis of entire subdomains by simply moving all elements of a subdomain/block; for this purpose, Moose offers the handy TimedSubdomainModifier. This capability has one side effect: some of the subdomains/blocks will be empty for some time steps.

In connection with this functionality, reports of the following libmesh error have recently become more frequent (where [SOME ID] is an integer identifying one of the now-empty subdomains):

libMesh terminating:
ExodusII_IO_Helper: block id [SOME ID] not found in block_ids.
[0] ../src/mesh/exodusII_io_helper.C, line 3266, compiled Aug 20 2024 at 14:09:52

Some of the issues seeing this error:

The workaround of leaving one element in a subdomain just to avoid this error is a bit of a hack and increases model complexity. I'd like to avoid that.

The error message shows that the error is thrown in libmesh. According to a backtrace I took with gdb, Moose calls libmesh at line 343 of Exodus.C (method Exodus::outputElementalVariables()): https://github.com/idaholab/moose/blob/82b674a17dca177b6c140de82e9f4ceb14f097ad/framework/src/outputs/Exodus.C#L333-L343

As discussed over here with @GiudGiud, Moose could try to remove the non-existing block from the element data before sending it to the Exodus writer, but that would be a little hacky. We assume that the better fix would be in libmesh: modifying write_element_data (and sister methods, if needed) so that it does not die on missing subdomains.

Would it be an option for libmesh to make write_element_data (and sister methods) more resilient to missing subdomains?

Jörg

jmeier commented 2 months ago

I'd like to provide some more details on how to reproduce the error.

Please consider the Moose input file from @Wendy-Ji posted here https://github.com/idaholab/moose/discussions/28485#discussion-7105994

[Problem]
  solve = false
[]

[Mesh]
  [generated]
    type = GeneratedMeshGenerator
    dim = 2
    nx = 10
    ny = 10
  []
  add_subdomain_ids = 1
[]

[Variables]
  [dummy]
    family = MONOMIAL
    order = CONSTANT
    block = '0 1'
  []
[]

[Preconditioning]
  [SMP]
    type = SMP
    full = true
  []
[]

[Executioner]
  type = Transient
  end_time = 2
[]

[Outputs]
  exodus = True
[]

Running this input file gives the error mentioned:

libMesh terminating:
ExodusII_IO_Helper: block id 1 not found in block_ids.
[0] ../src/mesh/exodusII_io_helper.C, line 3273, compiled Sep  6 2024 at 14:21:35

The backtrace using gdb looks like:

Thread 1 "shale-dbg" hit Catchpoint 1 (exception thrown), __cxxabiv1::__cxa_throw (obj=0x555555ee0b10, tinfo=0x7ffff7fb73c8 <typeinfo for libMesh::LogicError>, dest=0x7fffe86870a0 <libMesh::LogicError::~LogicError()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:80
80      in ../../../../libstdc++-v3/libsupc++/eh_throw.cc
(gdb) bt
#0  __cxxabiv1::__cxa_throw (obj=0x555555ee0b10, tinfo=0x7ffff7fb73c8 <typeinfo for libMesh::LogicError>, dest=0x7fffe86870a0 <libMesh::LogicError::~LogicError()>)
    at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:80
#1  0x00007fffe8ea2669 in libMesh::ExodusII_IO_Helper::initialize_element_variables(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<std::set<unsigned short, std::less<unsigned short>, std::allocator<unsigned short> >, std::allocator<std::set<unsigned short, std::less<unsigned short>, std::allocator<unsigned short> > > > const&) () from /home/mjg/mambaforge3/envs/moose/libmesh/lib/libmesh_dbg.so.0
#2  0x00007fffe8e5df9f in libMesh::ExodusII_IO::write_element_data(libMesh::EquationSystems const&) () from /home/mjg/mambaforge3/envs/moose/libmesh/lib/libmesh_dbg.so.0
#3  0x00007ffff5d8c169 in Exodus::outputElementalVariables (this=0x555555e2ad60) at /home/mjg/projects/moose/framework/src/outputs/Exodus.C:343
#4  0x00007ffff5d68ead in AdvancedOutput::output (this=0x555555e2ad60) at /home/mjg/projects/moose/framework/src/outputs/AdvancedOutput.C:292
#5  0x00007ffff5d8cb89 in Exodus::output (this=0x555555e2ad60) at /home/mjg/projects/moose/framework/src/outputs/Exodus.C:454
#6  0x00007ffff5d9da15 in OversampleOutput::outputStep (this=0x555555e2ad60, type=...) at /home/mjg/projects/moose/framework/src/outputs/OversampleOutput.C:100
#7  0x00007ffff5d9b080 in OutputWarehouse::outputStep (this=0x555555912d00, type=...) at /home/mjg/projects/moose/framework/src/outputs/OutputWarehouse.C:157
#8  0x00007ffff63f68f5 in FEProblemBase::outputStep (this=0x555555d659d0, type=...) at /home/mjg/projects/moose/framework/src/problems/FEProblemBase.C:6314
#9  0x00007ffff5c003cd in Transient::preExecute (this=0x555555dd8350) at /home/mjg/projects/moose/framework/src/executioners/Transient.C:254
#10 0x00007ffff5c00565 in Transient::execute (this=0x555555dd8350) at /home/mjg/projects/moose/framework/src/executioners/Transient.C:283
#11 0x00007ffff661433c in MooseApp::executeExecutioner (this=0x555555912460) at /home/mjg/projects/moose/framework/src/base/MooseApp.C:1178
#12 0x00007ffff661b705 in MooseApp::run (this=0x555555912460) at /home/mjg/projects/moose/framework/src/base/MooseApp.C:1560
#13 0x0000555555558e7e in Moose::main<shaleTestApp> (argc=3, argv=0x7fffffffbed8) at /home/mjg/projects/moose/framework/build/header_symlinks/MooseMain.h:47
#14 0x00005555555586e6 in main (argc=3, argv=0x7fffffffbed8) at /home/mjg/projects/shale/src/main.C:17
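
Frame #1 shows the exception coming from initialize_element_variables, which receives one std::set<unsigned short> of active subdomain ids per variable. A tolerant variant could, for example, ignore ids that have no block in the file instead of throwing. The following is only a minimal standalone sketch of that idea, not libmesh code: drop_missing_subdomains is a hypothetical helper, and representing the file's block ids as a std::vector<int> is an assumption.

```cpp
#include <set>
#include <utility>
#include <vector>

// Same element type as in the backtrace signature (std::set<unsigned short>).
using subdomain_id_type = unsigned short;

// Hypothetical helper, NOT part of libmesh: for each variable, keep only the
// subdomain ids that have a matching block in the Exodus file.
// Assumption: block_ids lists the ids of the blocks actually written.
std::vector<std::set<subdomain_id_type>>
drop_missing_subdomains(const std::vector<std::set<subdomain_id_type>> & vars_active_subdomains,
                        const std::vector<int> & block_ids)
{
  // Copy the present block ids into a set for fast lookup.
  const std::set<int> present(block_ids.begin(), block_ids.end());

  std::vector<std::set<subdomain_id_type>> filtered;
  filtered.reserve(vars_active_subdomains.size());

  for (const auto & active : vars_active_subdomains)
  {
    std::set<subdomain_id_type> kept;
    for (const auto id : active)
    {
      // An id without a block is skipped silently instead of raising an error.
      if (present.count(static_cast<int>(id)) > 0)
        kept.insert(id);
    }
    filtered.push_back(std::move(kept));
  }

  return filtered;
}
```

With the input file above, the variable is restricted to blocks '0 1' while only block 0 contains elements, so the sketch would reduce {0, 1} to {0}. One open point a real fix would need to settle: a variable restricted only to empty blocks ends up with an empty set, and if an empty set is interpreted as "no restriction", that case needs separate handling.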