Open KennethEJansen opened 1 year ago
Note: this discussion started in issue #393 but has now migrated here.
@jedbrown (the @-mention is not autocompleting, so I am not sure how Jed was on the earlier CGNS ticket but not here) contributed this picture in CNDA Slack and in the prior CGNS issue #296, which has been closed. From cgnsview:
With the following description (into which I will lace my questions as I try to collect the data to throw at these nodes; I am looking for a bit more detail about what is needed in each node/leaf):
One contiguous GridCoordinates, which gives a unique global ID to each node. Clear.
TetElements followed by PyramidElements followed by PrismElements cover the volume. There are two entries here: ElementRange and ElementConnectivity. The former seems to need to be contiguous, starting from 1 for a given zone and continuing as you move to a new topology of that zone. Obviously we can create this numbering, but I don't think SCOREC/core will spit the elements out of a single iterator like this; with multiple iterators, a conditional, and a counter it should not be hard. As I look at the documentation, it shows ElementType_t between ElementRange and ElementConnectivity. Does cgnsview just not show this, or is it promoting that to the folder name? I think no, because the folder name carries the number of nodes (e.g., Hexa27). ElementConnectivity is a table (all nodes of an element per row).
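Since the bookkeeping here (one running element counter across the topology sections of a zone) is easy to get subtly wrong, here is a minimal sketch of it with multiple blocks and a single counter; the Block struct and function names are hypothetical, not CGNS API:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-topology block: a name and its element count.
struct Block { std::string name; long count; };

// Assign each block the [start, end] ElementRange CGNS expects:
// 1-based and contiguous across all sections of the zone.
std::vector<std::pair<long, long>> elementRanges(const std::vector<Block>& blocks)
{
  std::vector<std::pair<long, long>> ranges;
  long next = 1; // CGNS element numbering starts at 1
  for (const Block& b : blocks) {
    ranges.push_back({next, next + b.count - 1});
    next += b.count;
  }
  return ranges;
}
```

So for, say, 4 tets, 2 pyramids, and 3 prisms, the sections would get ranges [1,4], [5,6], and [7,9], which is the pattern visible in the screenshot.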
This example has 3 element groups in a single zone: PyramidElements, TetElements, and PrismElements.
What are the FSFBrackets right after PrismElements and before ZoneBC? Is this just a boundary element mesh set that got written before the ZoneBC, after which comes the rest? Was this by design, or a glitch in the organization? ZoneBC is next, but the description skips to...
Each semantic surface (like Nacelle here) has an Elements_t node at the same level (under blk-1) with the element connectivity. This is a Pointwise convention, and it would also be possible to lump all the triangle faces into a single file node. If I understand this correctly, these are a single element group for each model face, following the same convention as the volume elements above. Specifically, I think the first set will range from 1 to the number of mesh faces on the first model face, followed by ElementType, followed by the connectivity table.
Under ZoneBC, there is a BC_t node for each semantic surface, pointing to the element range (which is contiguous because of the Pointwise convention, but need not be). Do we need these, or are we going to derive all boundary conditions from the model face mesh sets in the prior paragraph?
For each BC, there is also a Family_t node under Base that can offer more semantic information about that boundary type. The screenshot here doesn't use that in a meaningful way. Same question?
Note that Symmetry contains both triangles and quads, so there are two Elements_t nodes in the file, tri_Symmetry and quad_Symmetry, with element numbering contiguous across the two. The corresponding BC_t Symmetry node refers to the contiguous range over both face topologies. Clear.
Note that you generally don't manage writing these file nodes manually, but rather through the Boundary Condition interfaces. @jrwrigh, we should find a time to discuss this. I am currently reviewing what chef does to find the places we can extract this data from the mds database, but I assume that to use these functions we will have to put it into the specific data structures these interfaces require. If we get that right, your work might be pretty easy.
To answer your question, yes this is more of the "mesh sets" model. In PETSc, we'll create the volume elements and interpolate the mesh (create the faces and edges), then broker identification of the faces through a vertex (to keep communication structured and scalable) so we can add them to the appropriate Face Sets. OK my largest question is whether we will or will not need ZoneBC information that I have skipped above (thinking we don't need it). Second question is how we plan to handle periodicity. I think Jed discussed that somewhere so I will find that and surface that in another comment.
Notes (see CPEX 0031): "The use of ElementList and ElementRange for ptset_type is deprecated and should not be used in new code for writing boundary conditions. These are still currently accepted, but will be internally replaced with the appropriate values of PointList/PointRange and GridLocation_t, based on the base cell dimension. Code which reads older CGNS files should handle ElementList and ElementRange, however, since many older files contain these specifications for ptset_type." This is from the link @jedbrown gave to the Boundary Condition interfaces. It looks like they don't want boundary conditions to be given in terms of face sets, which is unfortunate in my opinion. Boundary conditions are attributes of model faces. Mesh "nodes" on model edges and model vertices are best resolved by inheritance rules from the intersection of faces. Obviously we can do that work on the SCOREC/core side and produce node lists for each boundary condition, but I am not clear yet whether PETSc wants these node lists or wants to do those inheritance rules and intersections from face lists.
Weak BCs certainly belong on faces. Are they suggesting that these also should be given as node lists?
Jed said:
Regarding BCs, GridLocation_t can take the value FaceCenter, so this is just a naming thing (preferring to call it PointList with FaceCenter instead of ElementList). Notable that Pointwise is still creating files with the old convention.
Jed said (much more)
As I look at the documentation, it shows ElementType_t between ElementRange and ElementConnectivity. Does cgnsview just not show this, or is it promoting that to the folder name?
You'll see it if you click on the "folder" nodes. The 10
appearing on the right is the enum value from the source code
typedef enum {
  CGNS_ENUMV( ElementTypeNull ) = CG_Null,
  CGNS_ENUMV( ElementTypeUserDefined ) = CG_UserDefined,
  CGNS_ENUMV( NODE ) = 2,
  CGNS_ENUMV( BAR_2 ) = 3,
  CGNS_ENUMV( BAR_3 ) = 4,
  CGNS_ENUMV( TRI_3 ) = 5,
  CGNS_ENUMV( TRI_6 ) = 6,
  CGNS_ENUMV( QUAD_4 ) = 7,
  CGNS_ENUMV( QUAD_8 ) = 8,
  CGNS_ENUMV( QUAD_9 ) = 9,
  CGNS_ENUMV( TETRA_4 ) = 10,
  CGNS_ENUMV( TETRA_10 ) = 11,
  CGNS_ENUMV( PYRA_5 ) = 12,
  CGNS_ENUMV( PYRA_14 ) = 13,
  CGNS_ENUMV( PENTA_6 ) = 14,
  CGNS_ENUMV( PENTA_15 ) = 15,
  CGNS_ENUMV( PENTA_18 ) = 16,
  CGNS_ENUMV( HEXA_8 ) = 17,
[...]
What are the FSFBrackets right after PrismElements and before ZoneBC?
The ordering in cgnsview
is arbitrary from what I can tell. It's just a surface with a few hundred triangles, and has a corresponding entry in ZoneBC
.
Each semantic surface (like Nacelle here) has an Elements_t node at the same level (under blk-1) with the element connectivity. This is a Pointwise convention, and it would also be possible to lump all the triangle faces into a single file node. If I understand this correctly, these are a single element group for each model face, following the same convention as the volume elements above. Specifically, I think the first set will range from 1 to the number of mesh faces on the first model face, followed by ElementType, followed by the connectivity table.
Mostly, though usually one writes the volume elements first so those start at 1
and the face numbers start after that. If you click around in this file, the element blocks are numbered sequentially with tet, pyramid, prism, then FSFBrackets (triangles), etc. The ordering here is arbitrary. Note that usually the mapping to file schema is handled by the library (we won't be reading/writing raw HDF5, but rather calling the CGNS interfaces that create these linked nodes). But I've certainly found it useful to explore using cgnsview
to see how a file was intended to be used.
Under ZoneBC, there is a BC_t node for each semantic surface, pointing to the element range (which is contiguous because of the Pointwise convention, but need not be). Do we need these, or are we going to derive all boundary conditions from the model face mesh sets in the prior paragraph?
They're supposed to be present and will be written when using the API. Here's an example from the test suite (includes multi-zone unstructured)
For each BC, there is also a Family_t node under Base that can offer more semantic information about that boundary type. The screenshot here doesn't use that in a meaningful way. Same question?
This appears optional, and is not written in the unstructured test. I haven't seen an example in which it's used for something I can recognize as important.
OK my largest question is whether we will or will not need ZoneBC information that I have skipped above (thinking we don't need it).
We want that because it's the way BCs are identified. It is automatically written by the interface. If for some reason we need naming heuristics instead, we could handle that, but we may as well write compliant files.
Second question is how we plan to handle periodicity. I think Jed discussed that somewhere so I will find that and surface that in another comment.
There is cg_conn_write. The docs (excerpt quoted below) describe how one can either match vertices to donor vertices or faces to donor faces. Note that CGNS vertices are all the nodes, while PETSc vertices are just corners (of arbitrary-order parametric elements). So my preference is to describe periodicity in terms of faces, which can be directly fed into the isoperiodic support. Life will be easiest if those face elements are oriented the same way on the periodic and donor surfaces.
For Abutting or Abutting1to1 interfaces, GridLocation can be either Vertex or FaceCenter. When GridLocation is set to Vertex, then PointList or PointRange refer to node indices, for both structured and unstructured grids. When GridLocation is set to FaceCenter, then PointList or PointRange refer to face elements. Face elements are indexed using different methods depending if the zone is structured or unstructured.
Thanks @jedbrown, this is much clearer now. Thanks also to @jrwrigh, who explained quite a bit of this to me when we met on Tuesday to discuss this. As you can see @cwsmith, I have made a PR out of my WIP branch. Clearly all the real development described in the PR description still needs to be done, but I hope you don't mind that we use the PR review utility as a communication platform to plan, accelerate, and document our progress. @cwsmith, I think I have copied everything from the prior ticket, so feel free to delete those comments if you want to keep the PR short, sweet, and focused just on its subject (relaxing/removing the requirement for C++14 in the CGNS development that then exposed SCOREC/core code that was not C++14 compliant). I think I have merged those changes into this branch, which does build a chef executable with CGNS enabled (which of course is necessary for this PR's development).
DEVELOPMENT ROADMAP
Note, I am not super clear on point 3. @jedbrown, how does CEED-PHASTA run continuation work? Are geometry and boundary condition files written at the end of a run that allow continuation of that run with the existing partition, or are we currently pulling the solution back to a serial CGNS file for visualization and thus needing work (and a plan) to save more data (in parallel?) to be used as a checkpoint that avoids redoing the partition and boundary condition creation?
On the last point, it will be possible to read any cgns checkpoint as mesh and initial condition. If the number of ranks matches, we can continue with the same partition, otherwise we'll repartition and continue.
Life will be easiest if those face elements are oriented the same way on the periodic and donor surfaces.
I see three issues here to clarify completely what "oriented the same way" means:
@jedbrown, I ask this question because, historically, for PHASTA, we have always created volume elements for the boundary elements with the first n_face_verts ordered as they should be for that volume element according to PHASTA's convention (and shape function definition). This has been complicated (and there are appropriately brutal complaints in the comments about this) by tet elements being negative-volume oriented (curling the tri face always points away from the opposing vertex), while all others follow the typical FE convention of positive elements (curl of the first two edges points to the opposing face).
I guess this whole comment can be TL;DR'd to the following questions for @jedbrown:
On the last point, it will be possible to read any cgns checkpoint as mesh and initial condition. If the number of ranks matches, we can continue with the same partition, otherwise we'll repartition and continue.
Just to be possibly pedantically clear, does mesh in this phrase include boundary condition information (the boundary element sets and ZoneBC), with some concept of the partition used in the simulation that wrote the checkpoint, such that, when continuing at the same part count, it is mostly reload-and-go (without redoing the rendezvous work that I assume will have to be done the first time PETSc loads the serial or parallel CGNS mesh to start the simulation)?
After meeting with @cwsmith and @jrwrigh this morning, and then thinking a bit about it during my bike ride in to campus it occurs to me that we do have another route that we could consider. I had dismissed it earlier for a variety of reasons but I guess I want to air it out here for support/criticism.
The idea is to let SCOREC-core/chef stay limited to writing POSIX PHASTA files (which means one file per part) and write a code that reads those files and writes a CGNS file. The upside is that all the CGNS-facing code could be Fortran based (allowing us to follow their examples exactly and not map all of it to C++, which I think @jrwrigh and I are less than strong in).
The downside is that I am less than sure of what CGNS wants us to "write" into a file to express parallel.
This is also what is paralyzing my decision to update the CGNS writer that is already in SCOREC-core, since it is hard for me to design it without a vision of how it gets written in parallel. Maybe I need to say a little more about how the data is laid out for @jedbrown to help me clear that mental/writer's block.
After partitioning the MDS database, there is a distinct set of elements on each part, and they retain classification against a real geometric model (if it exists) or a topological model if it does not. What chef, the pre-processor for PHASTA, does is have each part create a connectivity for the volume elements on that part and a distinct connectivity for the boundary elements of each topology on that part. This last part is a little subtle in that PHASTA wants volume elements on the boundary, so this is actually a block of elements for each volume-element-face-type topology on this process/part (yes, chef is limited to run on the same number of processes as there are parts).

If I have an all-tet mesh there will only be tri faces on the boundary, but if my part has elements on two or more model faces (e.g., if in a corner), all those tets will be in one element connectivity. This is easily reconciled to sort elements to particular boundaries because we can give each model face a surfID attribute, and each of these elements will then have this array to sort it to what CGNS wants. For complete clarity, if a volume element has faces on 2 or 3 boundaries, it will appear 2 or 3 times, respectively, each time with the ordering permuted such that the first 3 nodes are on the model face.

All-hex meshes are handled similarly, with tris bumping to quads. All-wedge meshes bring the added complexity of having quads on some model faces (first 4 of the 6 nodes permuted to be on the boundary) and tri faces on others but, in most usages, a given model face will be all quad or all tri, so not that complex. Mixed meshes could have multiple blocks of elements, even of the same face shape, on a given model face, but chef would put these into different blocks (e.g., tris from tets would be in a different block than tris from wedges or pyramids; quads from hexes would be in a different block from quads from wedges or pyramids). I mention pyramids but, in truth, a good mesh seldom has pyramids with a face on the boundary.
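A sketch of the permutation described above (reordering a linear tet so the vertices on the model face come first); the function and its 0-based local indexing are illustrative, not chef's actual code:

```cpp
#include <array>

// Illustrative only: reorder a linear tet's connectivity so the three
// vertices lying on the model face come first, as described for the
// boundary-element blocks above. faceLocal holds the local indices
// (0..3) of the tet's vertices that are on the face.
std::array<int, 4> faceFirst(const std::array<int, 4>& tet,
                             const std::array<int, 3>& faceLocal)
{
  std::array<int, 4> out{};
  std::array<bool, 4> onFace{};
  for (int l : faceLocal) onFace[l] = true;
  int k = 0;
  for (int l : faceLocal) out[k++] = tet[l];   // face vertices first
  for (int l = 0; l < 4; ++l)
    if (!onFace[l]) out[k++] = tet[l];         // opposing vertex last
  return out;
}
```

A tet appearing on two boundaries would simply go through this twice, once per face, producing the duplicated rows mentioned above.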
@cwsmith pointed out that it is not so simple to have O(10k) processes write their "chunk" of:
So, at long last, my question: given the above data layout on say 10-100k ranks, can the provided CGNS library file writers work with the data distributed among ranks like this (assuming we have filled the arrays we pass with the correct global numbers for that rank's data)?
Parsing that question into each category:
Does calling

call cg_coord_write_f(index_file,index_base,index_zone,RealDouble, 'CoordinateX',x,index_coord,ier)

on each of the 10k ranks result in a correct coordinates file to be loaded on 10k ranks of PETSc? Does calling

call cg_section_write_f(index_file,index_base,index_zone, 'Elem',HEXA_8,nelem_start,nelem_end,nbdyelem,ielem, index_section,ier)

on 10k ranks result in a correct connectivity file to be loaded on 10k ranks of PETSc? OR is it on us to do the parallel write?
I guess that is my real question. Are those CGNS writers going to handle the parallel writing, leaving our job to be only getting the data from rank-local numbering to the linearly increasing numbering? If those calls to CGNS "handle" the parallel writing for us, I already have a lot of that local-to-global code written in the version of PHASTA that uses PETSc (because PETSc requires it), which means that I might get this done faster by writing a Fortran code that:
Obviously a strong C++ coder could do the same within chef but I am not sure we have one of those. Further, obviously the serial version of this code is pretty simple AND follows the examples given here.
I'm not sure if I'm reading your whole post correctly, but this is how we do it in PETSc. Each rank creates a connectivity array for the elements it owns, and the collective write is done in these two lines (the first is metadata, the second is array data). https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/cgns/plexcgns2.c?ref_type=heads#L791-792
The vertex numbers are with respect to the node numbering used for GridCoordinates
, which use a similar two-stage write here.
@jedbrown thanks for the links to how PETSc sets up these files and writes data. For now I have abandoned the clunky idea of making a post-chef converter from POSIX to CGNS and returned to this branch to attempt to get these files written directly from chef, so we (@jrwrigh and I) should be able to mimic the PETSc writing closely. In the commit directly above, I believe I have the functions added to get the data for region connectivity and a surface connectivity in the form needed. Further, I think the coordinates are collected into an array that matches the expectations of CGNS (all x, then all y, then all z), though I am not sure why someone put a comment saying FORTRAN indexing??? I suppose they mean a flat array that runs over points with the linear point index first and the dimension of the point as the second index, not the 1-based indexing that we (or at least I) usually think of as Fortran indexing. C++-centric folks don't always appreciate the value of flat arrays over arrays of arrays.
static void getCoordinates(Output& o)
{
apf::Mesh* m = o.mesh;
int n = m->count(0);
double* x = new double[n * 3];
apf::MeshEntity* v;
int i = 0;
apf::MeshIterator* it = m->begin(0);
while ((v = m->iterate(it))) {
apf::Vector3 p;
m->getPoint(v, 0, p);
for (int j = 0; j < 3; ++j)
x[j * n + i] = p[j]; /* FORTRAN indexing */
++i;
}
m->end(it);
PCU_ALWAYS_ASSERT(i == n);
o.arrays.coordinates = x;
}
so it should be simple (serial for now) to throw three contiguous slices of this array at the CGNS writer. The o data structure is passed to the retained-but-not-updated writer function at the bottom of phCGNSgbc.cc, though I guess if we don't want to work with that, we could just inline this code and pass ranges of x to CGNS. That might be the best approach anyway, since we are going to have to compact and map this numbering for parallel anyway.
@jrwrigh, do you want to take a cut at the CGNS file-write stuff (set up for parallel, since that is surely the way we will go) so that we can get a serial test of coordinates + volume connectivity and then add the boundary element lists? That would free me to work on creating the map from on-rank numbering to global numbering, so that we could follow up that test with a parallel test soon. I am available to help when/if that is needed.
I think I have all the data in flat 1-D arrays for coordinates, volume connectivity, and boundary element connectivity to be chucked into a CGNS file in a parallel way. I am pretty terrible at function interfaces, so I pretty much pushed everything that needs to be shared into the output data structure (added stuff in phasta/phOutput.h). I lifted the complicated code from a PHASTA routine that finds the mapping from PHASTA on-rank numbering to PETSc global numbering, based on the on-rank-owned count plus the sum of the lower ranks' owned counts (global numbering increasing linearly with rank), which I interpreted CGNS to want as well. That is by far the most complicated part of this development and thus, assuming I did not introduce bugs as I changed its interface (which were the only intended changes), I am hopeful this does not have too many bugs.
Work left to do:
1) Add the code to do the CGNS writes
2) Add the code/flags to tell chef
we want CGNS writes
3) At that point we can create some test geometries and look at results in PV.
4) I will set up those test cases with surfID numbers that match our 1-6 so that the next step can be to pluck out the boundary element numbers for each "type" of boundary condition. I am thinking that the ZoneBC coding will be easy if I provide an array of the same size as the number of boundary elements, to make that relatively painless.
I will likely plug away at 2-4.
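For point 4, the surfID array can be bucketed into per-boundary element lists with something like this sketch (names hypothetical):

```cpp
#include <map>
#include <vector>

// Bucket boundary-element indices by their surfID so each bucket can
// become one Elements_t section / BC_t entry (surfIDs 1-6 as in the
// test cases described above).
std::map<int, std::vector<long>> bucketBySurfID(const std::vector<int>& surfID)
{
  std::map<int, std::vector<long>> buckets;
  for (long e = 0; e < (long)surfID.size(); ++e)
    buckets[surfID[e]].push_back(e); // e indexes the boundary-element block
  return buckets;
}
```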
As I start to add the calls to CGNS to write the CGNS file, I am noticing some places to easily get confused. phGeomBC.cc writes

writeInts(f, " mode number map from partition to global", o.arrays.globalNodeNumbers, m->count(0));

This is a map back to the ORIGINAL global numbering and not what CGNS/PETSc want (linearly increasing with rank ownership). We should consider a chef flag to choose one or the other (or both) to write into PHASTA files, to save doing this work in PHASTA at the start of every run that uses the PETSc solver. For the reading side, compare:
petsc/src/sys/classes/viewer/impls/cgns/cgnsv.c
[ 89%] Building CXX object phasta/CMakeFiles/ph.dir/phCGNSgbc.cc.o
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc: In function 'void ph::writeCGNS(ph::Output&, std::string)':
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:311:26: warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
311 | static char *outfile = "chefOut.cgns";
@matthb2 I am finally doing something we talked about long ago: making an alternative to syncIO. @jedbrown convinced me that CGNS was the least bad "standard" in terms of handling higher order. This PR is now on the edge of testing at least writing coordinates and volume connectivity from chef, but I have hit a snag in that I suspect my CGNS build is living in 32-bit integer fantasy land. While this is my problem to solve, I am hoping that, since I am using the Spack that you built up on the viz nodes (because that had CGNS and HDF5), maybe you can give me some advice/help?
Ideally, if you could get a CGNS with cgsize_t set to long long int built up in that Spack repo (or in another), that would help me at least determine whether SCOREC's gcorp_t type (for the same thing) would get past my current compile failure, which looks like this:
Consolidate compiler generated dependencies of target ph
[ 87%] Built target crv
[ 89%] Building CXX object phasta/CMakeFiles/ph.dir/phCGNSgbc.cc.o
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc: In function 'void ph::writeBlocksCGNS(int, int, int, ph::Output&)':
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:288:17: error: invalid conversion from 'cgsize_t*' {aka 'int*'} to 'gcorp_t' {aka 'long long int'} [-fpermissive]
288 | gcorp_t e = (cgsize_t *)malloc(nvert * e_owned * sizeof(cgsize_t));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| cgsize_t* {aka int*}
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:298:61: error: invalid conversion from 'gcorp_t' {aka 'long long int'} to 'const cgsize_t*' {aka 'const int*'} [-fpermissive]
298 | if (cgp_elements_write_data(F, B, Z, E, e_start, e_end, e))
| ^
| |
| gcorp_t {aka long long int}
In file included from /projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:14:
/usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-11.2.0/cgns-4.2.0-t6tbfmujxnjuya77iene7uuhknkyjoei/include/pcgnslib.h:92:51: note: initializing argument 7 of 'int cgp_elements_write_data(int, int, int, int, cgsize_t, cgsize_t, const cgsize_t*)'
92 | cgsize_t start, cgsize_t end, const cgsize_t *elements);
| ~~~~~~~~~~~~~~~~^~~~~~~~
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:300:10: error: invalid conversion from 'gcorp_t' {aka 'long long int'} to 'void*' [-fpermissive]
300 | free(e);
| ^
| |
| gcorp_t {aka long long int}
In file included from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/cstdlib:75,
from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/ext/string_conversions.h:41,
from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/bits/basic_string.h:6607,
from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/string:55,
from /projects/tools/SCOREC-core/core/phasta/phInput.h:13,
from /projects/tools/SCOREC-core/core/phasta/phOutput.h:4,
from /projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:2:
/usr/include/stdlib.h:483:25: note: initializing argument 1 of 'void free(void*)'
483 | extern void free (void *__ptr) __THROW;
| ~~~~~~^~~~~
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:308:17: error: invalid conversion from 'cgsize_t*' {aka 'int*'} to 'gcorp_t' {aka 'long long int'} [-fpermissive]
308 | gcorp_t e = (cgsize_t *)malloc(nvert * e_owned * sizeof(cgsize_t));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| cgsize_t* {aka int*}
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:316:10: error: invalid conversion from 'gcorp_t' {aka 'long long int'} to 'void*' [-fpermissive]
316 | free(e);
| ^
| |
| gcorp_t {aka long long int}
In file included from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/cstdlib:75,
from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/ext/string_conversions.h:41,
from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/bits/basic_string.h:6607,
from /usr/local/spack/v0.16/opt/spack/linux-debian8-nehalem/gcc-4.9.2/gcc-11.2.0-te7qrdtsvnii4dc32rg2z7oseu3b2c7s/include/c++/11.2.0/string:55,
from /projects/tools/SCOREC-core/core/phasta/phInput.h:13,
from /projects/tools/SCOREC-core/core/phasta/phOutput.h:4,
from /projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:2:
/usr/include/stdlib.h:483:25: note: initializing argument 1 of 'void free(void*)'
483 | extern void free (void *__ptr) __THROW;
| ~~~~~~^~~~~
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc: In function 'void ph::writeCGNS(ph::Output&, std::string)':
/projects/tools/SCOREC-core/core/phasta/phCGNSgbc.cc:340:26: warning: ISO C++ forbids converting a string constant to 'char*' [-Wwrite-strings]
340 | static char *outfile = "chefOut.cgns";
| ^~~~~~~~~~~~~~
phasta/CMakeFiles/ph.dir/build.make:159: recipe for target 'phasta/CMakeFiles/ph.dir/phCGNSgbc.cc.o' failed
make[3]: *** [phasta/CMakeFiles/ph.dir/phCGNSgbc.cc.o] Error 1
CMakeFiles/Makefile2:1294: recipe for target 'phasta/CMakeFiles/ph.dir/all' failed
make[2]: *** [phasta/CMakeFiles/ph.dir/all] Error 2
CMakeFiles/Makefile2:4997: recipe for target 'test/CMakeFiles/chef.dir/rule' failed
make[1]: *** [test/CMakeFiles/chef.dir/rule] Error 2
Makefile:1557: recipe for target 'chef' failed
make: *** [chef] Error 2
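For reference, the invalid conversions in the log all stem from declaring the buffer with the scalar gcorp_t typedef rather than as a pointer; a minimal sketch of the fix, inferring the surrounding code from the error messages (the local cgsize_t typedef here just stands in for the 32-bit type in the build log):

```cpp
#include <cstdlib>

typedef int cgsize_t; // stand-in for the 32-bit cgsize_t in the log

// Declaring the buffer as cgsize_t* (not as the scalar gcorp_t, a
// long long int) matches malloc's return, free's argument, and the
// const cgsize_t* parameter that cgp_elements_write_data expects.
cgsize_t* allocConnectivity(int nvert, int e_owned)
{
  return (cgsize_t*)malloc((size_t)nvert * e_owned * sizeof(cgsize_t));
}
```

Whether cgsize_t itself is 32- or 64-bit is a separate, build-time question (the Spack rebuild discussed below); the pointer-vs-scalar declaration is wrong either way.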
Calling in my lifeline favor from @matthb2 not only got me a library with cgsize_t as long int, but also got gcorp_t abandoned (it had mistakenly requested long long int in many places, including PHASTA). Ben also straightened me out on some interface issues. Taken together, this allows me to get to a point to start testing all-hex meshes with the most recent commit. The boundary is set up to write but, if I understood the @jrwrigh and @jedbrown conversation correctly, we are not really ready on the reading side to handle two connectivities?
if I understood @jrwrigh and @jedbrown conversation correctly, we are not really ready on the reading side to handle two connectivities?
Correct, PETSc currently only handles reading a single section (element connectivity array).
Give me a sample file and I'll see what I can do in the next couple days.
Preparing tests, I think our ultimate test, representative of the bump case, will be this: tri-face wedges on the wall (yellow), growing boundary layers off that wall that end when isotropic and are capped by tets that can continue the isotropic gradation.
But we need both multi-topology volume elements and multi-topology boundary elements. One simplification from there would be this (same wall mesh but all wedge volume mesh)
But we will first test these two mono-topology meshes
Give me a sample file and I'll see what I can do in the next couple days.
Crossing fingers that the last one, all hex, goes through without too many bugs. @jedbrown, I will make the first pass with no boundary element connectivity on that one, and then a second case with one boundary element set for all 6 faces.
@jrwrigh, I put a distinct surface ID on each of the 6 faces and, after the above two cases work, it should be pretty quick to give you an integer list of the same length as the number of boundary elements, with a surfID between 1 and 6, to sort those into the ZoneBC output.
The good news is that I found and fixed a few bugs, and it wrote a file and exited cleanly.
The bad news is that the mesh CRASHES ParaView, so obviously there are more bugs.
The further bad news is that I don't have a working cgnsview. @matthb2 installed /usr/local/cgns/4.4.0/bin/cgnsview on our cluster, but this just segfaults.
@jrwrigh and @jedbrown, what did you use to get working versions? I assume you both work with Linux, but Ben mentioned that he suspected what he got might not have built cleanly from scratch on Spack (probably misquoted), possibly using system HDF5 or something of that sort.
what did you use to get working versions?
Yeah, I just use one that comes with my package-manager-installed CGNS. I'm building CGNS on my Spack instance with +tools to see if that works too.
If you post the cgns file here (assuming it's small), I can take a look.
Being slightly more clear: cgnsview segfaults on launch (with or without vglrun). It also crashes with or without being given a file at all (not sure how it is intended to work).
what did you use to get working versions?
Yeah, I just use one that comes with my package-manager-installed CGNS. I'm building CGNS on my Spack instance with +tools to see if that works too. If you post the cgns file here (assuming it's small), I can take a look.
I am off to the store but here is the path on the viz nodes
/projects/tools/SCOREC-core/core/pumi-meshes/cube/sms2mdsAllHex/Chef/1-1-Chef/chefOut.cgns
@matthb2 did you try with +tools?
Given the exec @matthb2 pointed me to, /usr/local/cgns/4.4.0/bin/cgnsview, I would guess that is also a package-manager install, which is probably why Ben called it out on Slack. The viz nodes' base system is really in need of updating, which may be why a package-manager install won't cut it, so it would be good to figure out how to get it to install from Spack.
That file reads fine on my Paraview (5.11.1), but the result looks like the connectivity is wrong:
Any luck looking at it in cgnsview?
The connectivity is certainly off, so I can start looking at that, maybe walk the code in gdb, but it would be nice to confirm that the other stuff looks right. This suggests PHASTA's and SCOREC/core's ordering matches, so that is not the issue. I will have to swap local nodes 2 and 3 for tets... forever grateful for that, Chris Whiting. Wedges are also as we number them. At least for linear elements, which is what I want to get worked out first; we can face higher-order numbering when we have curved geometry that needs it (or solution transfer, I guess).
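For the tet reordering mentioned above, a minimal sketch (a hypothetical helper, not the actual chef code) of the node 2/3 swap that converts one linear tet's local ordering between the two conventions:

```cpp
#include <array>
#include <utility>

// Sketch only: converting a linear tet between the PHASTA/SCOREC ordering and
// the CGNS TETRA_4 ordering amounts to swapping local nodes 2 and 3 (1-based),
// i.e. 0-based indices 1 and 2 of the connectivity row.
static std::array<long, 4> swapTetNodes23(std::array<long, 4> tet) {
  std::swap(tet[1], tet[2]);  // local nodes 2 and 3 trade places
  return tet;
}
```

Applied to every tet row before the table is handed to the CGNS element writer, this leaves the other element topologies untouched.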
I've noticed that cgnsview does not handle relative paths correctly (so either run from the same directory, cgnsview file.cgns, or use an absolute path, cgnsview /path/to/file.cgns). Not sure if that's the issue. I don't think I have access to your viz nodes to examine the file.
I have fixed a few bugs but, for some reason, when I push to GitHub it is hanging.
I am still crashing Paraview when reading the CGNS file produced (n-2 bugs ago I could see something slightly better than what James showed).
I still can't push, so I am going to attach the only file that has been fixed since the last push. In this version I have also taken to echoing all the data passed to the CGNS data writers for an 8-element, 27-node mesh. I will share a pdf of the box with the 1-based node numbering that, to me, confirms the coordinates and the connectivity are right, so I am guessing it is one of the non-data commands, which I don't really know how to check without cgnsview.
Looks like it won't let me upload a CGNS file or a .cc file. I think you all have access to this space:
(base) kjansen@polaris-login-01: /eagle/cfdml_aesp/CGNS-share $ ls -alt
total 44
drwxr-sr-x 3 kjansen cfdml_aesp 4096 Aug 7 06:10 .
drwxrwsr-x 5 kjansen users 4096 Aug 7 06:09 1-1-Chef
drwxrws--- 19 root cfdml_aesp 4096 Aug 7 06:02 ..
-rw-r--r-- 1 kjansen users 13336 Aug 7 06:02 phCGNSgbc.cc
-rw-r--r-- 1 kjansen users 12902 Aug 7 06:01 chefOut.cgns
(base) kjansen@polaris-login-01: /eagle/cfdml_aesp/CGNS-share $ ls -alt 1-1-Chef/
total 22
drwxr-sr-x 3 kjansen cfdml_aesp 4096 Aug 7 06:10 ..
drwxrwsr-x 5 kjansen users 4096 Aug 7 06:09 .
drwxrwsr-x 2 kjansen users 4096 Aug 7 06:09 1-procs_case
-rwxr-xr-x 1 kjansen users 584 Aug 7 06:09 adapt.inp
-rw-r--r-- 1 kjansen users 11074 Aug 7 06:09 chefOut.cgns
-rw-r--r-- 1 kjansen users 32663 Aug 7 06:09 geom.smd
drwxr-sr-x 2 kjansen users 4096 Aug 7 06:09 mdsMesh
drwxr-sr-x 2 kjansen users 4096 Aug 7 06:09 mdsMeshIn
-rwxr-xr-x 1 kjansen users 284 Aug 7 06:09 runChef.sh
(base) kjansen@polaris-login-01: /eagle/cfdml_aesp/CGNS-share $ cd ..
(base) kjansen@polaris-login-01: /eagle/cfdml_aesp $ ls -alt
total 524
drwxr-sr-x 3 kjansen cfdml_aesp 4096 Aug 7 06:10 CGNS-share
Finally, this is what that .cc file, when compiled in, should produce for the echoed data arrays via printf:
geombc file written in 0.000718 seconds
1, 27
1, -0.500000
2, 65.000000
3, 65.000000
4, -0.500000
5, -0.500000
6, 65.000000
7, 65.000000
8, -0.500000
9, -0.500000
10, 65.000000
11, 65.000000
12, -0.500000
13, 32.250000
14, 65.000000
15, 32.250000
16, -0.500000
17, 32.250000
18, 65.000000
19, 32.250000
20, -0.500000
21, -0.500000
22, 32.250000
23, 65.000000
24, 32.250000
25, 32.250000
26, 32.250000
27, 32.250000
1, 27
1, 65.000000
2, 65.000000
3, -0.500000
4, -0.500000
5, 65.000000
6, 65.000000
7, -0.500000
8, -0.500000
9, -0.500000
10, -0.500000
11, 65.000000
12, 65.000000
13, 65.000000
14, 32.250000
15, -0.500000
16, 32.250000
17, 65.000000
18, 32.250000
19, -0.500000
20, 32.250000
21, 32.250000
22, -0.500000
23, 32.250000
24, 65.000000
25, 32.250000
26, 32.250000
27, 32.250000
1, 27
1, 65.000000
2, 65.000000
3, 65.000000
4, 65.000000
5, -0.500000
6, -0.500000
7, -0.500000
8, -0.500000
9, 32.250000
10, 32.250000
11, 32.250000
12, 32.250000
13, 65.000000
14, 65.000000
15, 65.000000
16, 65.000000
17, -0.500000
18, -0.500000
19, -0.500000
20, -0.500000
21, 32.250000
22, 32.250000
23, 32.250000
24, 32.250000
25, 65.000000
26, -0.500000
27, 32.250000
1, 8
1, 4, 15, 22, 9, 16, 25, 27, 21
2, 16, 25, 27, 21, 1, 13, 24, 12
3, 15, 3, 10, 22, 25, 14, 23, 27
4, 25, 14, 23, 27, 13, 2, 11, 24
5, 9, 22, 19, 8, 21, 27, 26, 20
6, 21, 27, 26, 20, 12, 24, 17, 5
7, 22, 10, 7, 19, 27, 23, 18, 26
8, 27, 23, 18, 26, 24, 11, 6, 17
mesh verified in 0.000939 seconds
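As a cross-check of the echoed table, the eight connectivity rows above can be machine-verified. A minimal sketch (not part of chef; the data is hard-coded from the dump above) confirms that every element has 8 distinct nodes and that the union of all rows is exactly the node range 1..27 from GridCoordinates:

```cpp
#include <set>

// The 8-element HEXA_8 connectivity echoed above, one row per element.
static const int kConn[8][8] = {
  { 4, 15, 22,  9, 16, 25, 27, 21},
  {16, 25, 27, 21,  1, 13, 24, 12},
  {15,  3, 10, 22, 25, 14, 23, 27},
  {25, 14, 23, 27, 13,  2, 11, 24},
  { 9, 22, 19,  8, 21, 27, 26, 20},
  {21, 27, 26, 20, 12, 24, 17,  5},
  {22, 10,  7, 19, 27, 23, 18, 26},
  {27, 23, 18, 26, 24, 11,  6, 17},
};

// Sanity check: 8 distinct nodes per element, union covering exactly 1..27.
static bool connectivityLooksSane() {
  std::set<int> all;
  for (const auto &row : kConn) {
    std::set<int> nodes(row, row + 8);
    if (nodes.size() != 8) return false;  // repeated node within an element
    all.insert(nodes.begin(), nodes.end());
  }
  return all.size() == 27 && *all.begin() == 1 && *all.rbegin() == 27;
}
```

This only validates the table as a plausible 2x2x2 hex block; it cannot catch a wrong node ordering within an element, which is the kind of error Paraview's rendering would expose.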
Now that's all pushed up; git push finally worked.
Note Aug 6, 2024.pdf
A more rested brain checking the data I used printf to dump to the screen again: I see no errors. As you can see in the code, I am using printf to write the x array that I pass to CGNS (just after the start node number and end node number) when it is x_1, then repeating that when it is x_2 and then x_3.
After that, I am printing the element number followed by its 8 nodes in a 1-based numbering, which I labeled in the pdf. I then used check marks in 8 colors, one for each line, to confirm that I am passing a volume element connectivity with element numbers ranging from 1 to 8 (that is not actually passed, but the range is) and whose first 8 entries are element 1, second 8 are element 2, and so on.
Am I misunderstanding what CGNS wants? I mean, the entries range from 1 to 27, so even if I were scrambling I would expect a tangled mesh like we saw before, but I don't even get that, so I am guessing it must be one of the headers that is not what CGNS is expecting.
On that, I more or less cut and pasted code from here. There were a lot of instances where the C++ compiler hinted that the "variable" I passed to CGNS was not recognized, and it usually suggested that maybe I "meant" CG_<variable>, which I assumed to mean that the example has not been updated to follow their new labels (which I would be tempted to throw stones at, but then looking at PHASTA's lack of documentation, I won't).
In short, I am out of ideas to debug, so I am really hoping someone with a working cgnsview can use it to probe the file for anomalies. I will re-check the lines as best I can. Is there a way to ask it to write the file in ASCII rather than binary?
With respect to the chefOut.cgns on Polaris, there is a more fundamental error.
$ cgnslist chefOut.cgns
cgio_open_file:ADF 16: Internal error: Memory boundary tag bad.
$ h5dump chefOut.cgns
h5dump error: unable to open file "chefOut.cgns"
OOOH. Is it possible Paraview-CGNS is not picking up on the 64-bit integers? If so, that really sucks: the interface and/or HDF5 file is not communicating that, and/or Paraview is not detecting it.
@jrwrigh can you try to load this file into PETSc (which I have more faith in being able to recognize 64-bit) to see if it loads properly there?
@jedbrown does that test need to be done with a 64-bit PETSc, or will a 32-bit PETSc build be able to read an HDF5 file under CGNS that is 64-bit?
The file isn't valid hdf5, like perhaps wasn't closed correctly after writing.
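One quick local probe of that failure mode (a sketch, not an existing tool in this thread) is to check for the 8-byte HDF5 signature, which files written by the CGNS/HDF5 library carry at offset 0; a truncated or never-closed file typically fails this check:

```cpp
#include <fstream>

// Sketch only: every HDF5 file (and thus every HDF5-backed CGNS file written
// by the library) begins with the signature \x89 'H' 'D' 'F' \r \n \x1a \n.
// This checks offset 0 only, which is where the CGNS library places it.
static bool looksLikeHDF5(const char *path) {
  static const unsigned char sig[8] = {0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};
  std::ifstream f(path, std::ios::binary);
  unsigned char buf[8];
  if (!f.read(reinterpret_cast<char *>(buf), 8)) return false;  // too short/unreadable
  for (int i = 0; i < 8; ++i)
    if (buf[i] != sig[i]) return false;
  return true;
}
```

A file that fails this would also fail h5dump immediately, which matches the error shown above.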
With respect to the chefOut.cgns on Polaris, there is a more fundamental error.
$ cgnslist chefOut.cgns
cgio_open_file:ADF 16: Internal error: Memory boundary tag bad.
$ h5dump chefOut.cgns
h5dump error: unable to open file "chefOut.cgns"
Same Q: do these checking tools need to be a build of CGNS with 64-bit support, or are they able to read files written by both 32-bit and 64-bit builds?
I will review the example file again to be sure I have kept straight which variables were supposed to be typed int vs. typed cgsize_t, but I am not sure I trust the example since it won't compile against modern versions of the library. I guess I should review against the PETSc example, but that is not easy for me to do because so much stuff goes through enums that I am not so clear on.
I guess I was blindly trusting/hoping that the C++ compiler was going to tell me any place the library wanted a type different from what I was passing. I feel like it did that a lot, but maybe there are instances where it let things slide and that is the problem?
The file isn't valid hdf5, like perhaps wasn't closed correctly after writing.
I just wrapped the close in an error check, compiled, and ran again and there is no error so.....
(base) kjansen@viz003: /projects/tools/SCOREC-core/core (CGNS_OneBase)$ git diff
diff --git a/phasta/phCGNSgbc.cc b/phasta/phCGNSgbc.cc
index e38b5b2b..dd37e565 100644
--- a/phasta/phCGNSgbc.cc
+++ b/phasta/phCGNSgbc.cc
@@ -418,7 +418,7 @@ void writeCGNS(Output& o, std::string path)
*/
writeBlocksCGNS(F,B,Z, o);
- cgp_close(F);
+ if(cgp_close(F)) cgp_error_exit();
// if (!PCU_Comm_Self())
// lion_oprint(1,"CGNS file written in %f seconds\n", t1 - t0);
}
diff --git a/pumi-meshes b/pumi-meshes
--- a/pumi-meshes
+++ b/pumi-meshes
@@ -1 +1 @@
-Subproject commit c00ba9c16cacbb361ee538c03a3ec694ddb989f2
+Subproject commit c00ba9c16cacbb361ee538c03a3ec694ddb989f2-dirty
I can scp the latest file after the error wrap, but I assume it is not any different since the above is the only code change and there was no error. Maybe I am missing something as I follow the example with that error wrap (though I see now the close is the one command they do not wrap, so maybe that does not work). I guess I can also put an error wrap on the connectivity writes, which I have not done yet, to see if it tells me anything.
Every cgxxx function is now wrapped to exit with error like in the example (even the ones they don't wrap), and I recompiled and re-ran with no errors. Whatever it does not like is not caught by its error handlers.
Are we sure it (whatever @jedbrown is running) has the capacity to read cgsize_t set to long?
From the example I am following: "Last updated 06 July 2013"
@jrwrigh if you have a 32-bit CGNS, can you try to build up this branch on your laptop, after altering cgsize_t to be int instead of long int, to test this there? If that works, trying both flipped to 64-bit would be the next test... well, both plus your cgnsview if 64-bit fails, as that would confirm that everybody has to be 32 or 64 with no ability to detect and adjust.
To be clear, the file I have written uses every cgpxxx call in the example except the following, which I assumed were optional. Are they not optional?
/* create a centered solution */
if (cg_sol_write(F, B, Z, "Solution", CellCenter, &S) ||
    cgp_field_write(F, B, Z, S, RealSingle, "CellIndex", &Fs))
    cgp_error_exit();

/* create the field data for this process */
d = (float *)malloc(nelems * sizeof(float));
nn = 0;
for (n = 1; n <= tot_nelems; n++) {
    if (n >= start && n <= end) {
        d[nn] = (float)n;
        nn++;
    }
}

/* write the solution field data in parallel */
if (cgp_field_write_data(F, B, Z, S, Fs, &start, &end, d))
    cgp_error_exit();

/* create user data under the zone and duplicate solution data */
ncells = tot_nelems;
if (cg_goto(F, B, "Zone_t", 1, NULL) ||
    cg_user_data_write("User Data") ||
    cg_gorel(F, "User Data", 0, NULL) ||
    cgp_array_write("CellIndex", RealSingle, 1, &ncells, &A))
    cgp_error_exit();

/* write the array data in parallel */
if (cgp_array_write_data(A, &start, &end, d))
    cgp_error_exit();
@cwsmith Can you comment on what -DMDS_ID_TYPE=long as a flag to cmake does? Is it replacing every int declaration by long (in which case I am throwing 64-bit integers at CGNS for variables declared as int)? I suppose I need to learn cmake well enough to figure this out myself from:
(base) kjansen@viz003: /projects/tools/SCOREC-core/core (CGNS_OneBase)$ grep -ri MDS_ID_TYPE *
example_config_with_python_interface.sh: -DMDS_ID_TYPE=int \
mds/CMakeLists.txt:set(MDS_ID_TYPE "int" CACHE STRING "Interal identifier integer type")
mds/CMakeLists.txt:message(STATUS "MDS_ID_TYPE: ${MDS_ID_TYPE}")
mds/mds_config.h.in:#define MDS_ID_TYPE @MDS_ID_TYPE@
mds/pkg_tribits.cmake:set(MDS_ID_TYPE "int" CACHE STRING "Interal identifier integer type")
mds/mds.h:typedef MDS_ID_TYPE mds_id;
mds/mds.c: lion_eprint(1, "please recompile with -DMDS_ID_TYPE=long\n");
If so, what is the "escape" to declare a 32-bit integer as CGNS may require? @matthb2 is it feasible to have both a 32-bit CGNS and a 64-bit version so that I can better test this issue? I assume that it is bad form to have two "users" adding stuff to Spack or, I guess, if you did it as root, I should be able to be root and do that myself as well. I suppose I could also just build up my own Spack, but that is a double disk hog.
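For what it's worth, a common "escape" when the mesh database uses 64-bit ids but the installed CGNS was built with a 32-bit cgsize_t is an explicit, range-checked narrowing at the CGNS call boundary. A sketch with stand-in typedefs (hypothetical names, not chef code):

```cpp
#include <climits>
#include <stdexcept>

// Sketch only: with -DMDS_ID_TYPE=long the mds ids are 64-bit, while a 32-bit
// CGNS build declares cgsize_t as int. Rather than relying on implicit
// conversion, narrow explicitly and fail loudly if the id does not fit.
typedef long mds_id;  // what -DMDS_ID_TYPE=long yields via mds/mds.h
typedef int cg32_t;   // stand-in for cgsize_t in a 32-bit CGNS build

static cg32_t toCgSize32(mds_id v) {
  if (v < INT_MIN || v > INT_MAX)
    throw std::overflow_error("mds_id out of 32-bit cgsize_t range");
  return static_cast<cg32_t>(v);
}
```

The check costs almost nothing and turns a silent truncation into an immediate, diagnosable failure.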
But I am perplexed by the fact that C++ was verbosely scolding me every time I tried to pass a variable of the wrong type, so why is it not seeing this issue if I am in fact throwing long ints that I declared as int at the variables it expects to be ints? Further, my printf is printing with %d for ints and %ld for long ints as expected (it usually also complains if it thinks otherwise, AND the numbers look right).
Yeah, I doubt that's the issue (at least passing a &outvar or array of the wrong type will not be accepted, and passing an invar will be coerced to the correct type). Setting MDS_ID_TYPE just affects the mds_id typedef.
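That distinction can be seen in a tiny stand-alone sketch (stand-in functions, not the CGNS API):

```cpp
// Sketch only: a by-value int parameter silently accepts a long (possibly
// truncating), but an int* parameter will not accept a long*, so only the
// by-value path can hide a width mismatch from the compiler.
static int takesInt(int v) { return v; }      // by value: long coerced
static void fillsInt(int *out) { *out = 7; }  // by pointer: long* rejected

// long id = 42;
// int got = takesInt(id);   // OK: value silently converted
// fillsInt(&id);            // compile error: cannot convert long* to int*
```

This is why the compiler complained loudly about output arrays of the wrong type but would let a 64-bit input id slide through a by-value int parameter.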
Probably you can run the test program as-is from the CGNS repository to create a sample file and confirm that it's valid. (The CGNS test suite does lots of writing files and reading them back to inspect.)
Problem seems to be resolved. Actually, it seems to have been resolved some time ago, and I missed that Paraview was loading different files than I expected.
@jedbrown Please let me know:
CGNS tets read by PV.
Prepare and write CGNS boundary connectivity and ZoneBC in one base
This is a WIP, as we hope to coordinate the code development through this PR. Currently, SCOREC/core prepares a broad variety of meshes, boundary conditions, initial conditions, and partitions of the same for the PHASTA flow solver with the application chef. That data is currently written for each part of the partition in what is referred to as the PHASTA POSIX file format.
CEED-PHASTA, being built on PETSc, already has CGNS writing and growing CGNS reading capability. This PR seeks to leverage meshes already in the mds database (which are usually classified against a model that has initial-condition and boundary-condition attributes). The developments of this PR will further leverage the routines that extract the volume mesh (coordinates and connectivity), initial conditions on the volume mesh entities, and the boundary element mesh (one boundary element connectivity per surface). Once this data is extracted into data structures, the existing CGNS writing capability will be modified to write all of the above data in a single "base" so that it can be read by PETSc and applied in the CEED-PHASTA solver.
The first test will be a simple flow through a box but more extensive tests will be developed as more capabilities are added.
Here is the required cmake configuration. Our testing was done within Spack using gcc 11.2, CGNS (latest, 4.4.0), and OpenMPI 4.1.1 (plus whatever dependencies they required). We also used Spack versions of Zoltan and ParMETIS. Simmetrix is not strictly required, but 10 of the 13 tests used a Simmetrix mesh, and the version of simModSuite was 18.0-230111.