Closed olinwc closed 1 year ago
@roystgnr is this something you might be able to look into?
Here's the output when running ./reactor-devel -i ../../../gHPMR_2d_core_v29.i --mesh-only
Setting Up.
*** ERROR ***
Assertion `it->second == node' failed.
it->second = 0x60000cb8b120
node = 0x60000cbbc660
Stack frames: 14
0: 0 libmesh_devel.0.dylib 0x0000000110549eec libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 1148
1: 1 libmesh_devel.0.dylib 0x0000000110546683 libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*, std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 179
2: 2 libmesh_devel.0.dylib 0x0000000110ede839 libMesh::Poly2TriTriangulator::triangulate_current_points() + 3625
3: 3 libmesh_devel.0.dylib 0x0000000110eeaa98 libMesh::Poly2TriTriangulator::triangulate() + 136
4: 4 libmoose-devel.0.dylib 0x000000010e0707ec XYDelaunayGenerator::generate() + 2380
5: 5 libmoose-devel.0.dylib 0x000000010e131291 MeshGenerator::generateInternal() + 1313
6: 6 libmoose-devel.0.dylib 0x000000010ef2cd5a MeshGeneratorSystem::ex [ 10.47 s] [ 119 MB]
ecuteMeshGenerators() + 1658
7: 7 libmoose-devel.0.dylib 0x000000010e8c6013 Action::timedAct() + 83
8: 8 libmoose-devel.0.dylib 0x000000010e8c59c9 ActionWarehouse::executeActionsWithAction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 185
9: 9 libmoose-devel.0.dylib 0x000000010e8fe248 ActionWarehouse::executeAllActions() + 264
10: 10 libmoose-devel.0.dylib 0x000000010ef9b525 MooseApp::runInputFile() + 117
11: 11 libmoose-devel.0.dylib 0x000000010ef95870 MooseApp::run() + 912
12: 12 reactor-devel 0x000000010a450bcb main + 139
13: 13 dyld 0x000000010b31552e start + 462
[0] ../src/mesh/poly2tri_triangulator.C, line 396, compiled May 16 2023 at 01:06:12
Stack frames: 6
0: 0 libmesh_devel.0.dylib 0x0000000110549eec libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) + 1148
1: 1 libmoose-devel.0.dylib 0x000000010efcdb30 moose::internal::mooseErrorRaw(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >) + 784
2: 2 libmoose-devel.0.dylib 0x000000010e6eb217 void mooseError<char const*>(char const*&&) + 247
3: 3 libmoose-devel.0.dylib 0x000000010ef96a26 MooseApp::run() + 5446
4: 4 reactor-devel 0x000000010a450bcb main + 139
5: 5 dyld 0x000000010b31552e start + 462
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=1
:
system msg for write_line failure : Bad file descriptor
@roystgnr is this something you might be able to look into?
If I can reproduce it, at least. I'll try that now.
Here's the first file I started hitting the failure with. On line 393 I have a pattern which runs with no assertion hit and then on line 408 the specified pattern fails when I change a single assembly in PatternedHexMeshGenerator
Score! I can reproduce this with v29 ... though not with v18 for some reason.
This is the MOOSE commit I hit it with
commit e0b3d67f8489c7e79339b532606fe884756545a1 (HEAD)
Merge: 42342ed08d 395ab6da47
Author: moosetest <bounces@inl.gov>
Date: Mon May 22 22:33:58 2023 -0600
Merge commit '395ab6da47fe9a63aab358a9ca122e2fd6dd943d'
Seeing if I can pare this down as much as possible ... removing control drums 5,6,2,3 gets me the same error ... removing 1,4 too gets me a different assertion failure.
Hey, what's going on with y1
vs y2
in outer_core_ring
? (I'm like 99.99% sure this is unrelated to the failure; I'm just confused by it)
I saw somebody make a circle with ParsedCurveGenerator
once doing that so I just copied them to make a circle
I used this example, which was close to a circle, and figured I could do the same thing to make a circle. Listing 2 in Example Syntax for https://mooseframework.inl.gov/source/meshgenerators/ParsedCurveGenerator.html
[pcg]
type = ParsedCurveGenerator
x_formula = '1.0*cos(t)'
y_formula = 'y1:=1.0*sin(t);
y2:=1.5*sin(t);
if(t<pi,y1,y2)'
section_bounding_t_values = '0.0 ${fparse pi} ${fparse 2.0*pi}'
nums_segments = '10 10'
constant_names = 'pi'
constant_expressions = '${fparse pi}'
is_closed_loop = true
[]
Thanks for the explanation! They have that extra trickiness because of the jump from 1.0 to 1.5; you could have just gone down to y_formula = '${outer_core_radius}*sin(t)'
since you don't have a jump. No shame, though; "find someone else who did it right and tweak their thing" is definitely the right way to get started.
Running v18 on Sawtooth in devel mode I also don't hit the assertion.
Do you still hit it on Sawtooth with v29?
Just tested it and I do
Let me test some more of my intermediate versions on Sawtooth and see what hits it there
V21 runs fine
V25 fails
V23 fails
And V22 was just a small control rod test, which passes
V21, which passes gHPMR_2d_core_v21.txt
V23, which fails gHPMR_2d_core_v23.txt
Unfortunately, the difference between V21 and V23 is adding the control drums, which it looks like you've already determined is the issue
Nope. I've gotten rid of the control drums entirely and managed to retain the failure.
The same failure, even; it cropped up again as I pared down the problem further.
Well this is interesting, I modified V21 to shrink the outer ring because I remember having problems with shrinking that ring, and I get a different assertion
Setting Up [ 8.61 s] [ 120 MB]
*** ERROR ***
Assertion `new_first != new_second' failed.
new_first = 2367
new_second = 2367
Stack frames: 16
0: libMesh::print_trace(std::ostream&)
1: libMesh::MacroFunctions::report_error(char const*, int, char const*, char const*, std::ostream&)
2: /scratch/calvolin/sawtooth/griffin/moose/scripts/../libmesh/installed/lib/libmesh_devel.so.0(+0xe83476) [0x7fffe980b476]
3: libMesh::Poly2TriTriangulator::insert_refinement_points()
4: libMesh::Poly2TriTriangulator::triangulate()
5: XYDelaunayGenerator::generate()
6: MeshGenerator::generateInternal()
7: MeshGeneratorSystem::executeMeshGenerators()
8: Action::timedAct()
9: ActionWarehouse::executeActionsWithAction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
10: ActionWarehouse::executeAllActions()
11: MooseApp::runInputFile()
12: MooseApp::run()
13: main
14: __libc_start_main
15: ./reactor-devel() [0x402945]
[0] ../src/mesh/poly2tri_triangulator.C, line 957, compiled May 24 2023 at 16:30:35
Stack frames: 7
0: libMesh::print_trace(std::ostream&)
1: moose::internal::mooseErrorRaw(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
2: void mooseError<char const*>(char const*&&)
3: /scratch/calvolin/sawtooth/griffin/moose/framework/libmoose-devel.so.0(+0xb6caee) [0x7fffeb58faee]
4: main
5: __libc_start_main
6: ./reactor-devel() [0x402945]
[sawtooth1:mpi_rank_0][MPIDI_CH3_Abort] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0: No such file or directory (2)
Yeah, that's the other I've been able to hit.
I'm well out of time for tonight. I've got a failure case down to 70 lines and a much smaller mesh. roy_v20.txt
I don't actually understand some of the things I can't remove, though. What do num_sectors_per_side
and num_sectors_per_side_meta
do? From https://mooseframework.inl.gov/source/meshgenerators/HexagonConcentricCircleAdaptiveBoundaryMeshGenerator.html I see that the former will be overridden, but I still have to provide it?
Well, even simpler would have been even better, but this is looking comprehensible enough in the debugger.
At the simplest failure I can trigger, I'm seeing triangles with inverted orientations, which would definitely explain why refining those triangulations isn't working ... but even if inputs aren't being consistent about boundary orientation, the triangulator code is supposed to be detecting and correcting any incompatibility. Still trying to figure out why it's not doing so here.
@roystgnr Regarding num_sectors_per_side_meta
that's needed to add the metadata for "irregular" hex cells added in PatternedHexMeshGenerator
. Members of the reactor team can explain exactly why that metadata is required better than me.
I don't think I specify any sides to adapt HexagonConcentricCircleAdaptiveBoundaryMeshGenerator
to so num_sectors_per_side
still defines the initial number of sectors in the hexagons, but I end up deleting this interior and remeshing/refining the boundary with XYDG, which is why that sector definition doesn't actually "matter" in the final mesh.
I guess if I defined a really large number of sectors per side, like 100, it would be more refined than what the XYDG refinement generates.
I was triggering inverted elements in cases where I forced sliver elements (by disabling outer boundary refinement but still requiring desired_area=0.93
); without that, I can still get an error but it's now "Opposing point on constrained edge" from inside poly2tri.
On the bright side, I've ruled out a whole bunch of issues, and I'm down to 29 lines on the failing test case. roy_v22.txt
And now I'm down to 7 elements on the unrefined failure case; getting rid of the angular offset in inner_core_ring
, I can bring that down to 3 sides and outer_core_ring
down to 4, and I still see (sufficient amounts of) refinement trying to create sliver elements. This is also clearly narrowing down the bug to "definitely in my code", so there's no longer any risk of me coming back ranting about how many days I wasted on an input that couldn't be meshed without FPU-defying sliver elements; the slivers here are definitely an artifact of something wrong in the code, not the input file.
I would definitely have felt bad (and taken responsibility) if that was the case
Eh, even if it had been a bad input file, it would have been the sort of bad input file that XYDelaunay is supposed to be sanity checking. But we have a known bug in the sanity checking; I was kind of hoping that we'd slipped past another of those, not a bug in the actual triangulation code.
This is definitely some problem with sliver elements. I can even trigger a failure with no holes, if I just disable boundary refinement and push interior refinement hard enough. I'd noticed that in the past, but I'd thought it was happening with triangle aspect ratios like 10^6, not aspect ratios like 10. And I'd never managed to trigger it with the early stages of a triangulation refinement before, but the combination of small boundary/hole segments and large domains seems to be able to do so.
Not sure I'll have it fixed today, though. Feeling sick, might have to knock off early.
The failure case I've boiled down to is making more and more sense. It's not that the aspect ratios are ~10 and then it fails, the aspect ratios are ~10 when we're pushed hard enough to hit the failure case: we see an element in need of refinement, we do the Standard Delaunay Thing and try to add its circumcenter as a new node, but the circumcenter falls outside an boundary with refinement disabled so as a fallback we add the centroid of that boundary element instead ... and that works well enough once or twice, but at some point refining the boundary element's Delaunay cavity no longer naturally leads to refinement of the desired element, we loop until we have crazy 10^6 scale slivered elements, and either poly2tri or libMesh or both fail via FP precision issues.
This might be as simple as having a safer fallback, then. Fingers crossed.
Not that simple. Safer fallback fixed my simplest failure case completely, but isn't sufficient for the v29 reactor mesh. I'll "bisect" again and keep going from here, though.
roy_v20.txt
still fails with the new fallback; roy_v22.txt
succeeds now.
I may be also creating sliver elements inadvertently when circumcenters fall just inside a boundary? ... which is easy enough to do when you combine interesting regular geometries with slight perturbations. If that's the case then I need an extra test. Look at each boundary of a new cavity element, and if the area coordinate of a new Steiner point with respect to an unrefinable boundary is too small, increase it? (and if the area coordinate of a new Steiner point with respect to a refinable boundary is too small, snap it to the boundary?)
I'm about done for the day though so I'll have to take a crack next week.
Thanks for taking the time to look into this @roystgnr ! Tag @jortensi since they are interested in this issue also
Here is the latest version of the core I'm using, I coarsened the area meshed by XYDG to see how low I could get the number of elements. It still fails in devel
in case you want to test this with whatever changes you are making. I will be out on PL all of next week and will be back in office on 6/5
gHPMR_2d_core_v33.txt
And ... I already had that extra test (though computed via cosines rather than area coordinates).
Once that red herring was out of the way I found some too-high-floating-point-tolerance issues, and tightening up those gets some more of my test cases working, but roy_v20.txt is still failing.
What the hell. Poly2Tri occasionally gives us a non-Delaunay triangulation?
And if I manually fix non-Delaunay triangulations, then OH MY GOD IT WORKS. Including on gHPMR_2d_core_v29.txt.
Not sure how long this'll take to clean up (I'll get everything into a "roys_junk202305" branch right away lest I regress it again while refactoring...) and downstream (just fixing up the input deck sanity checking seems like it'll require a Sockeye test suite fix...), but I'll get started now.
@roystgnr just wanted to let you know I am back at work if you need anything from me, really appreciate all the time you've spent working on fixing this!
Could you try out libMesh/libmesh#3572 on more of your own cases? It'll be a while before it's downstream (probably next week) and I'd hate to get the libMesh submodule updated only to discover that there were still further fixes necessary.
Will be resolved in MOOSE once https://github.com/idaholab/moose/pull/24735 is merged.
Bug Description
The attached mesh input file runs without errors in
opt
and (appears) to generate a valid exodus file which can be viewed in Paraview, but when running indevel
mode assertions are hit. This on my local Mac Pro.Steps to Reproduce
Run the attached input file with
--mesh-only
and try to run inopt
anddevel
modeImpact
I haven't tried actually using this mesh yet in a problem to see what happens, but the error in
devel
mode does not inspire confidence. Also have been hitting some other issues with the mesh sometimes hanging, but I can't debug because it won't run indevel
mode.Input File gHPMR_2d_core_v29.txt