idaholab / moose

Multiphysics Object Oriented Simulation Environment
https://www.mooseframework.org
GNU Lesser General Public License v2.1
1.77k stars 1.05k forks source link

Incorrect block IDs in PatternedHexMeshGenerator, possible overflow #25070

Open olinwc opened 1 year ago

olinwc commented 1 year ago

Bug Description

When using PatternedHexMeshGenerator incorrect block ids are assigned if the block id is above a certain value for the meshes defined in inputs. If a block_id is too large in an input mesh then it seems the next highest available block id is assigned.

Steps to Reproduce

Changing to_delete_block_id from 100 to 1000 results in the error. Instead of the to_delete block id existing in fuel_assembly_with_rsc mesh, it will instead only have graphite_quad.

bug_example.txt

How the mesh should look

correct_block_ids

How it looks when the bug occurs

incorrect_block_ids

Impact

An annoyance once I figured out what the problem was.

lindsayad commented 1 year ago

Pinging our reactor codeowners, @loganharbour @GiudGiud @roystgnr

GiudGiud commented 1 year ago

Is it me or it s not the first time @olinwc has overflowed these ids haha

I think the fix here is to provide stricter parameter checking on SubdomainID-type (and all other overflow prone types!) parameters. This can be done on the input parameter parsing side iirc.

olinwc commented 1 year ago

Is it me or it s not the first time @olinwc has overflowed these ids haha

Unironically I think this is probably the fourth time I've managed to overflow ids somewhere.

roystgnr commented 1 year ago

This isn't an overflow.

libMesh defaults to unsigned 16-bit subdomain ids (with unsigned(-1) reserved), and IIRC MOOSE doesn't change that, but even that should support ids up to 65534; it should have no problem whatsoever with 1000. libMesh in METHOD=dbg has a ton of overflow checking (although not enough; C/C++ is annoyingly lax about implicit conversion between integer types, so we have to hunt for those and manually replace them with tested explicit conversions...), and none of it's triggering here. I was hoping this would quickly trigger some assertion failure in dbg mode and we'd just need to enable a test upstream of that in opt mode, but no.

Is some mesh generator using 1000 as a magic number, so then trying to specify it for a different subdomain as a user is conflicting with that? If I try to_delete_block_id=999, it works. If I try to_delete_block_id=1001, then to_delete again doesn't exist in the output, but now the elements that should have been part of it are part of helium_quad instead of graphite_quad. With to_delete_block_id=1002, the input file works again.

There's got to be some place where we can be looking for id conflicts but we aren't yet. I'll see what I can find.

roystgnr commented 1 year ago

There's got to be some place where we can be looking for id conflicts but we aren't yet. I'll see what I can find.

It's probably not sufficient for all cases, but doing remapping (where conflicting ids have independent names to prove an intent for independent subdomains) or screaming and dying (where we don't) in stitch_meshes seems to be a good start. I'll get a PR into libMesh shortly. I think I'll try enabling the sanity-checking by default; we'll have an option to get the old behavior but hopefully we won't have to futz with too many CI tests that inadvertently depended on it.

roystgnr commented 1 year ago

hopefully we won't have to futz with too many CI tests that inadvertently depended on it.

Famous last words. I'm getting 14 CI failures on my workstation from the reactor module alone.

On the other hand, maybe that's evidence that the sanity-checking really needs to be on by default, even if we have a bunch of work to do to get CI ready for that.