SCOREC / core

parallel finite element unstructured meshes

creating large matched meshes from connectivity and coordinates #205

Open cwsmith opened 5 years ago

cwsmith commented 5 years ago

A large structured hex (eventually mixed-topology) mesh is needed, and there are issues getting a mesh generator to produce this easily.

The matchedReader branch contains a driver named matchedNodeElmReader that calls the apf::construct API to create an MDS/PUMI mesh from files containing connectivity and coordinates.

There are no new/special compilation flags needed to build this branch of core and the driver.

Run the driver with the following command:

mpirun -np <numProcs> ./matchedNodeElmReader box_small.cnn box_small.crd NULL outModel.dmg outMesh.smb

NULL is the placeholder argument for the file containing the matching info.

The box_small files are attached in a tarball: box_small.tar.gz

The format for the .cnn, .crd, and .match files is:

$ head -5 box_small.cnn
1 1 2 53 52 2602 2603 2654 2653
2 2 3 54 53 2603 2604 2655 2654
3 3 4 55 54 2604 2605 2656 2655
4 4 5 56 55 2605 2606 2657 2656
5 5 6 57 56 2606 2607 2658 2657
...
el# global_node1 ... global_node8 (element number followed by its eight global node ids)

$ head -5 box_small.crd
1 0.000000 0.000000 0.000000
2 0.000000 0.000000 0.020000
3 0.000000 0.000000 0.040000
4 0.000000 0.000000 0.060000
5 0.000000 0.000000 0.080000
nd# x y z

$ head -5 box_small.match
1 51
2 -1
3 -1
4 -1
5 -1
node# matching node (or -1 if unmatched)
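For reference, a minimal serial C++ sketch of parsing one line of each format; this is illustrative only and is not the driver's actual reading code (file names and loop bodies are placeholders).

#include <cstdio>

int main() {
  FILE* cnn = fopen("box_small.cnn", "r");
  FILE* crd = fopen("box_small.crd", "r");
  FILE* mtc = fopen("box_small.match", "r");
  if (!cnn || !crd || !mtc) return 1;
  long elm, gn[8], vtx, partner;
  double x, y, z;
  /* .cnn: element id followed by its eight global vertex ids */
  while (fscanf(cnn, "%ld %ld %ld %ld %ld %ld %ld %ld %ld", &elm,
                &gn[0], &gn[1], &gn[2], &gn[3], &gn[4], &gn[5], &gn[6], &gn[7]) == 9) {
    /* store the connectivity of element elm */
  }
  /* .crd: vertex id followed by x y z */
  while (fscanf(crd, "%ld %lf %lf %lf", &vtx, &x, &y, &z) == 4) {
    /* store the coordinates of vertex vtx */
  }
  /* .match: vertex id followed by its matched vertex id, or -1 */
  while (fscanf(mtc, "%ld %ld", &vtx, &partner) == 2) {
    /* record the match of vertex vtx */
  }
  fclose(cnn); fclose(crd); fclose(mtc);
  return 0;
}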

@rickybalin

cwsmith commented 5 years ago

Pasting Ken's email below

Cameron and I met today to discuss the plan for this effort. He is close to having something to share.

Remove the counter from the first column of all files and assume that the line number is the global entity number (with the obvious adjustment for multiple-topology connectivity):

File 1) Coordinates (forgot what we named it):
numVerts 3
x1 x2 x3
(repeat the coordinate line for the remaining numVerts - 1 vertices)

File 2 connectivity (again forgot what we called it)

numElmThisTop nElmNodes
gn1 gn2 ... gn_nElmNodes
(repeat the connectivity line for the remaining numElmThisTop - 1 elements)
Repeat the header and rectangular array for each topology. These could go into different files initially if that is easier, and then cat the files to create File 2.

File 3) Matching:
numVerts 1
One entry per vertex: -1 if unmatched, otherwise the node number it is matched with.

File 4) Classification:
numVerts 1
One tag per vertex:
0 = fully interior of the volume
1-6 = classified on a model face (not edge or vertex)
11-22 = classified on a model edge (not the end points, which are model vertices)
31-38 = classified on a model vertex

I suppose on this last one we should agree on a master numbering. Cameron, I know SCOREC used to have canonical numbering for hex elements and if you can find that we should stick with it (it probably also exists in Simmetrix documentation too)?

File 5) Boundary conditions: We should probably recycle our existing flat-text smd or spj file format, with the only change being that the model tag is replaced by the classification tag described in File 4.
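To make Files 1-3 concrete, here is a sketch of what their first lines might look like for a tiny 8-vertex, single-hex box; the counts and values are made up for illustration only, and the node ordering in File 2 would follow whatever hex convention is agreed on:

File 1 (coordinates):
8 3
0.0 0.0 0.0
1.0 0.0 0.0
...

File 2 (connectivity):
1 8
1 2 3 4 5 6 7 8

File 3 (matching):
8 1
-1
-1
...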

Riccardo mentioned in Slack that he has a 1B element case ready for testing and called it his largest, so I suspect he has some intermediate sizes as well, which would be good.

Current plan:
1) Cameron is going to push a version of his code that ignores matching to GitHub as soon as he confirms it works on the already provided 175k element case on 1, 2, and 4 target mds parts.
2) Riccardo and I will retest 1).
3) Riccardo and I will begin to test progressively larger cases.
4) Cameron will move on to add features in the following order: a) matching capability, b) classification ingestion, c) boundary condition ingestion, and the connection of a), b), and c) such that Chef has what it would have gotten from smd and mesh previously.
5) Riccardo and I test features as they get added and obviously run PHASTA to verify that all the information is coming through.

We are leaving initial conditions out, but (Riccardo can correct me if I am wrong) the codes he is using already have a way to overwrite good, spatially varying initial and boundary condition values. The latter may or may not require correct BC codes, but I think the above will provide that.

Let me know if I have forgotten anything or gotten any of it down differently than we discussed today.

Best regards,

Ken

cwsmith commented 5 years ago

The parallel reader now works.

KennethEJansen commented 5 years ago

Am I correct in assuming that this is a separate repo from SCOREC/core, i.e., a personal repo? Can you provide some instructions for how best to check this out as a branch "under" our standard repo? I guess I am not git-savvy enough to do that, since for PHASTA we keep everything under either the public or the next repo, which are ultimately both "linked" on GitHub.

cwsmith commented 5 years ago
cd /path/to/existing/core/repo
git remote add cws git@github.com:cwsmith/core.git
git fetch cws
git checkout matchedReader

On my system this looks like:


~/develop/core (develop)$ git remote -v
origin  git@github.com:SCOREC/core.git (fetch)
origin  git@github.com:SCOREC/core.git (push)
~/develop/core (develop)$ git remote add cws git@github.com:cwsmith/core.git
~/develop/core (develop)$ git fetch cws
remote: Enumerating objects: 75, done.
remote: Counting objects: 100% (75/75), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 75 (delta 53), reused 68 (delta 46), pack-reused 0
Unpacking objects: 100% (75/75), done.
From github.com:cwsmith/core
 * [new branch]        apf_sim_config          -> cws/apf_sim_config
 * [new branch]        chef                    -> cws/chef
 * [new branch]        cws/control_print_stmts -> cws/cws/control_print_stmts
 * [new branch]        dcVtx                   -> cws/dcVtx
 * [new branch]        develop                 -> cws/develop
 * [new branch]        fusion_dev              -> cws/fusion_dev
 * [new branch]        generateAdvanced        -> cws/generateAdvanced
 * [new branch]        liipbmod                -> cws/liipbmod
 * [new branch]        master                  -> cws/master
 * [new branch]        matchedReader           -> cws/matchedReader
 * [new branch]        mis-work                -> cws/mis-work
 * [new branch]        phastaChef_mesh_measure -> cws/phastaChef_mesh_measure
 * [new branch]        sim_getDgCopies         -> cws/sim_getDgCopies
 * [new branch]        ugridhacks              -> cws/ugridhacks
~/develop/core (develop)$ git checkout matchedReader
Branch 'matchedReader' set up to track remote branch 'matchedReader' from 'cws'.
Switched to a new branch 'matchedReader'
~/develop/core (matchedReader)$ 
KennethEJansen commented 5 years ago

Thanks. I have successfully built the code on the viz nodes and repeated the 175k test. I further tested the following larger cases

Parts | Mesh size | Time to Create (s) | Memory Peak (GB) | Time to Write smb (s)
4     | 8M        | 65                 | 4x1.7            | 26
4     | 125M      | 999                | 4x27             | 350
8     | 125M      | 1012               | 8x13.5           |

(Note the case is named 150M but it is actually 125M elements according to the code.) As expected, time and memory scale linearly for 4 parts, so all is looking good at this point. The mesh creation does not seem to benefit from more cores/parts, but maybe that is expected?

Thanks for adding the render function and output. I did verify the 175k and 8M cases in ParaView.

I will have to move to Cooley to test the 1B element case since it is projected to need > 800GB.

KennethEJansen commented 5 years ago

kjansen@cc014: /projects/TRAssembly_2/kjansen/Models/BoeingBump/matchedReaderTest/1B $ mpirun -np 32 --hostfile $COBALT_NODEFILE /projects/TRAssembly_2/SCOREC-core/build-matchedReader/test/matchedNodeElmReader box_WMLES_1B.cnn box_WMLES_1B.crd NULL mod mesh/
numElms 1000000000
numVerts 1003003001
isMatched 0
seconds to create mesh 3589.092
mesh verified in 158.253273 seconds
mesh mesh/ written in 22.195060 seconds
writeVtuFile into buffers: 129.821842 seconds
writeVtuFile buffers to disk: 1.203647 seconds
vtk files rendered written in 136.573262 seconds

Looking in top on Cooley for the above case: I had 8 nodes, so 4 processes per node. One process seems far ahead of the other 3; it is running at 100% while the other three are at 60% with a status of D. Presumably they are struggling to read from a single file. At this point all are in the compute phase.

Tasks: 364 total, 5 running, 359 sleeping, 0 stopped, 0 zombie
%Cpu(s): 28.2 us, 5.5 sy, 0.0 ni, 66.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 39597302+total, 30272150+free, 90494048 used, 2757468 buff/cache
KiB Swap: 5242876 total, 4956576 free, 286300 used. 30331353+avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
49976 kjansen 20 0 23.0g 18.8g 10296 R 100.0 5.0 37:28.15 matchedNodeElmR
49977 kjansen 20 0 19.1g 16.7g 10300 R 100.0 4.4 36:23.76 matchedNodeElmR
49974 kjansen 20 0 31.2g 24.1g 10316 R 96.2 6.4 54:27.47 matchedNodeElmR
49975 kjansen 20 0 23.0g 18.9g 10300 R 96.2 5.0 34:57.68 matchedNodeElmR
7344 kjansen 20 0 177836 3324 2180 R 3.8 0.0 0:18.54 top

But note the 18 minute gap in TIME+. Earlier the processes now at ~34 minutes were hovering at 1.6G, but I guess they are now done reading and are processing elements.

Here are the results from 64 processes

kjansen@cc014: /projects/TRAssembly_2/kjansen/Models/BoeingBump/matchedReaderTest/1B $ mpirun -np 64 --hostfile $COBALT_NODEFILE /projects/TRAssembly_2/SCOREC-core/build-matchedReader/test/matchedNodeElmReader box_WMLES_1B.cnn box_WMLES_1B.crd NULL mod mesh/
numElms 1000000000
numVerts 1003003001
isMatched 0
seconds to create mesh 3433.222
mesh verified in 87.809694 seconds
mesh mesh/ written in 13.419787 seconds
writeVtuFile into buffers: 72.740630 seconds
writeVtuFile buffers to disk: 0.592511 seconds
vtk files rendered written in 76.710356 seconds

KennethEJansen commented 5 years ago

Note all of the above is with NULL passed for matching. Please let me know when you think the code is ready to test matching. Also, it looks like you have not made the jump to the new file formats yet (e.g., your code is still not expecting a header and is reading id as the first column on crd and cnn at least).

KennethEJansen commented 5 years ago

I think we are facing a tradeoff between I/O bottlenecks at high process counts per node and the scalability of the MDS element creation. To test the extreme of this I tried one process per node but, as you can see below, this fails on an INT_MAX limit:

jansen@cc014: /projects/TRAssembly_2/kjansen/Models/BoeingBump/matchedReaderTest/1B $ mpirun -np 8 --hostfile $COBALT_NODEFILE /projects/TRAssembly_2/SCOREC-core/build-matchedReader/test/matchedNodeElmReader box_WMLES_1B.cnn box_WMLES_1B.crd NULL mod mesh/
numElms 1000000000
numVerts 1003003001
isMatched 0
ERROR PCU message size exceeds INT_MAX... exiting
ERROR PCU message size exceeds INT_MAX... exiting
[cc015:mpi_rank_2][error_sighandler] Caught error: Aborted (signal 6)
[cc009:mpi_rank_5][error_sighandler] Caught error: Aborted (signal 6)
ERROR PCU message size exceeds INT_MAX... exiting
[cc003:mpi_rank_6][error_sighandler] Caught error: Aborted (signal 6)
ERROR PCU message size exceeds INT_MAX... exiting
[cc010:mpi_rank_3][error_sighandler] Caught error: Aborted (signal 6)
ERROR PCU message size exceeds INT_MAX... exiting
[cc004:mpi_rank_4][error_sighandler] Caught error: Aborted (signal 6)
ERROR PCU message size exceeds INT_MAX... exiting
[cc005:mpi_rank_1][error_sighandler] Caught error: Aborted (signal 6)
ERROR PCU message size exceeds INT_MAX... exiting
[cc014:mpi_rank_0][error_sighandler] Caught error: Aborted (signal 6)

cwsmith commented 5 years ago

Thanks for testing. Glad to see the tool worked for 1B elements. I've added that result to the table.

Parts | Mesh size | Time to Create (s) | Memory Peak (GB) | Time to Write smb (s)
4     | 8M        | 65                 | 4x1.7            | 26
4     | 125M      | 999                | 4x27             | 350
8     | 125M      | 1012               | 8x13.5           |
32    | 1B        | 3589               | 32x31?           | 22
64    | 1B        | 3433               | ?                | 13
8     | 1B        | Fail               |                  |
16x1  | 1B        | 2807               | ?                | 39
16x2  | 1B        | 3472               | ?                | 18.7
16x4  | 1B        | 2906               | ?                | 14

I think we'll have to move to MPI_IO and binary files to get the read times down. This would add another bullet to item 4 from here. Matlab supports binary file writing: https://www.mathworks.com/help/matlab/ref/fwrite.html
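For illustration, a hedged sketch of what an MPI-IO read of a headerless binary coordinate file could look like; the file layout, function name, and per-rank partitioning are assumptions, not an agreed format:

#include <mpi.h>
#include <vector>

// Each rank reads its contiguous slice of numVerts*3 doubles from a headerless
// binary coordinate file. Layout and names are illustrative only.
std::vector<double> readCoordsBinary(const char* fname, long numVerts, MPI_Comm comm) {
  int rank, size;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);
  long vtxPerRank = numVerts / size;
  long firstVtx = rank * vtxPerRank;
  if (rank == size - 1) vtxPerRank = numVerts - firstVtx; // last rank takes the remainder
  std::vector<double> coords(vtxPerRank * 3);
  MPI_File fh;
  MPI_File_open(comm, fname, MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
  MPI_Offset off = (MPI_Offset)firstVtx * 3 * sizeof(double);
  MPI_File_read_at_all(fh, off, coords.data(), (int)coords.size(), MPI_DOUBLE, MPI_STATUS_IGNORE);
  MPI_File_close(&fh);
  return coords;
}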

Matching is not supported yet. I'll post an update here when it is ready for testing.

KennethEJansen commented 5 years ago

Since this approach appears promising, I am going to take some time here to plan the remaining steps to putting this into a production run. I will put tentative assignments via initials, but of course we can redistribute as needed:

1) (DONE) Preliminary testing for meshes up to 1B. This is nearly complete as noted above by KEJ.
2) (DONE) Adding matching capability (CWS). See #211.
3) Switch over to the header plus removal of the ID column (CWS).
4) Adding classification capability (CWS).
5) Creation of "real" meshes (current ones are isotropic boxes) with spatially varying 2D boundary layers and vertical filling to the top boundary (RB). Retesting should be easy since, other than bugs, it is just coordinate changes. After coords and connectivity are checked, the "mesh generation" code needs the capability to write the model classification file described above. It should also prepare the BC/IC flat text file.
6) (DONE, CWS confirms flat text will work) Adding the load of a flat text BC and IC description a la spj or smd (CWS, or other if needed, since I think there is a prospect that what we have will just work as long as 4) is done in a way that is compatible).
7) With 6) complete we should be able to start testing in PHASTA (RB).
8) Push this up to our larger target meshes (RB).
9) Adding multi-topology (CWS).
10) Test 8) and 9) in PHASTA.
11) Adding MPI_IO and binary (CWS). This is a low priority for now. Of course it would be nice, but I think we can get to > 10B without it. The 1B element case on 16 nodes with one process per node takes about 1/2 an hour to read the mesh. KEJ has modified the code to measure this directly and will get data at 2 and 4 processes per node as time allows, but since we were able to get 1B through on 8 nodes, I suspect we can get 10B through on 80 (or probably much less), which we have on Cooley. I further suspect it would be reasonably straightforward to do the preliminary element partitioning in the same code that creates the coordinates and connectivity, such that the elements can be written to already-partitioned files. It might even be possible to write the coordinates (and other vertex-based data) to separate files. This would likely speed up the data ingestion. That said, I am keeping this as the lowest priority because we are not likely to do this often enough to justify a huge effort to make this stage as fast as it could be, and I don't want us to spend time (on both sides) on that fine tuning until we get the first 10 tasks verified.

cwsmith commented 5 years ago

What is the notation in the parts column? Is it nodes x processes? No, it is nodes x processes_per_node.

The list of todo items looks good to me.

Is 10B the upper limit for mesh size? Yes, near term.

rickybalin commented 5 years ago

Yes 10B should be the upper limit.



rickybalin commented 5 years ago

To give an update on number 5 in the list of steps above: I am able to generate meshes of the desired geometry with the desired features (a spatially varying BL height is still missing, but that is easy to add).

Here is a picture of the mesh I created. Note this is only a 35k element mesh, but should be good enough to test that this workflow is successful with a simple RANS simulation.

[image: mesh]

KennethEJansen commented 5 years ago

This looks good. I am going to add to task 5 what you probably already know goes with it -- modifying your code to write the model classification and boundary condition files which are needed by task 6.

rickybalin commented 5 years ago

I started working on that, but could not find a rule of thumb for the order of numbering of the geometric entities (faces, edges, and vertices). I am making one up, but I will be able to modify the code for another order easily.

KennethEJansen commented 5 years ago

Unless Cameron says otherwise, I suggest you follow the order of the connectivity array. Model vertices start with 0 at the "origin," which we can define as (minx, miny, minz); the three other model vertices on that face cycle in an order such that their curl points into the domain. These 4 vertices define the first four edges as well (e0 = v1 - v0, e1 = v2 - v1, etc.). These first 4 edges define face 0 (f0). The other 4 model vertices follow the same order but shifted into the third dimension (e.g., v4 is the third-dimension displacement of v0, and the others shift this relationship by 1 so that v7 is the extension of v3). The next 4 edges are numbered such that their origin is v{0..3} and their endpoint is v{4..7}, i.e., they are directed into the third dimension (e.g., e4 = v4 - v0). The last 4 edges are "parallel" to the first 4 and form the loop for the face opposite face 0 (e.g., e8 = v5 - v4). The next 4 faces are formed such that their "base" is on the first 4 edges (e.g., f1 has its base on e0, and in general f{1..4} has its base on e{0..3}). Finally, face 5 closes off the box opposite f0.
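One possible reading of this numbering, written out as lookup tables. This is my interpretation for illustration, not an agreed canonical ordering; in particular, the non-base edges of faces 1 through 4 are a guess.

// Edge i runs from edgeVerts[i][0] to edgeVerts[i][1].
static const int edgeVerts[12][2] = {
  {0,1}, {1,2}, {2,3}, {3,0},   // e0-e3: loop bounding face 0
  {0,4}, {1,5}, {2,6}, {3,7},   // e4-e7: directed into the third dimension
  {4,5}, {5,6}, {6,7}, {7,4}    // e8-e11: loop bounding the face opposite face 0
};
// Face i is bounded by the four edges faceEdges[i][0..3].
static const int faceEdges[6][4] = {
  {0, 1, 2, 3},    // f0
  {0, 5, 8, 4},    // f1: base on e0
  {1, 6, 9, 5},    // f2: base on e1
  {2, 7, 10, 6},   // f3: base on e2
  {3, 4, 11, 7},   // f4: base on e3
  {8, 9, 10, 11}   // f5: closes the box opposite f0
};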

rickybalin commented 5 years ago

Code for the classification file is also done, and I gave the file a ".class" extension. This file has the mesh node global ID in the first column and the classification tag in the second column, with no header line, since it seems like a header is not wanted in the other files.

For the BC and IC, I simply copied the .spj file that had the desired boundary and initial conditions for a .smd model with the same geometry, and modified the face and model tags to match what was done in the classification file. The tag I used for what Simmetrix and SimModeler call the region is 0 because we said to give a classification of 0 to the fully interior nodes, but I can change that if 0 is not an appropriate value.

This should complete the set of files required, which for the hill geometry can be found on the viz nodes under the name "2DGaussHill_test" at: /users/rbalin/NSF_DataDrivenTurbMod/PrelimTests/matchedNodeElmReader/2Dhill/test

cwsmith commented 5 years ago

@rickybalin Looking back at the box_small case there appears to be a problem with the matching file. Specifically, if vtx 1 is matched with vtx 51, as indicated by line one,

$ head -n 1 box_small.match 
1 51

I'd expect to see that vtx 51 was matched with vtx 1 on line 51. Instead it looks like vtx 51 is not matched.

$ sed '51q;d' box_small.match 
51 -1
rickybalin commented 5 years ago

That was intentional. I made that script so that the nodes on the z_min face would be matched to the "master" nodes on the z_max face, but not the other direction.

I will change it so the nodes on both the periodic faces point to the corresponding node on the other face.

The file box_small.match in /users/rbalin/NSF_DataDrivenTurbMod/PrelimTests/matchedNodeElmReader/channel/small has been changed accordingly.

KennethEJansen commented 5 years ago

Cameron, if it makes your matching work easier, it would be easy for us to restrict ourselves to cases where there is only on-part matching.

This is as easy as being sure to choose numVerts/numParts = k * Nz, where k is strictly an integer, Nz is the number of points in the matched direction (z in this case), numVerts is the total number of vertices, and numParts is the number of parts we will stream this into.

Since numVerts = Nx * Ny * Nz for tensor-product grids, this is equivalent to Nx * Ny = numParts * k,

which also says there are k lines of Nz nodes each on every part.

This is a super easy constraint for us to live with if it makes your life easier. Even if we drop the tensor product grids when we go to multi-topology, it would not be hard to order those vertices such that they always end up on the same part after the streaming.
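Stated as a quick check in code (assuming the vertices are streamed with whole z-lines kept contiguous, as described above):

// true if every part receives an integer number of whole Nz-length node lines
bool onPartMatchingOk(long numVerts, long numParts, long Nz) {
  return numVerts % (numParts * Nz) == 0;
}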

KennethEJansen commented 5 years ago

We should also plan at which stage we will switch the file formats (e.g., add the header with the array shape and ditch the ID column).

cwsmith commented 5 years ago

See #211 for progress on matching and other test cases.

Moving @KennethEJansen's posts on classification here:

vertex: whatever it is classified on

edge classification: an edge in the mesh will have 2 vertices. From their model classifications the following pairs are possible, and each pair maps to the edge classification as follows:
v v -> e
v e -> e
v f -> f
v r -> r
e e -> e
e f -> f
e r -> r
f f -> f
f r -> r
r r -> r

face classification: a mesh face will have 3 or 4 mesh vertices with model classification. If any of the vertices are classified on a model region -> region; else -> model face. It is impossible to have more than one model face among the 3 or 4 vertices' classifications.

region -> region

KennethEJansen commented 5 years ago

Since, in the above, there is a general rule that an entity's classification is always set by the closure node whose classification has the highest dimension, I think you can encapsulate all of the above as: loop over the nodes of the entity's closure and assign the entity the minimum of their classification tags, i.e., entClass = min(nodesOfClosureClass).

Because we numbered higher-dimensional model entities with lower tags, this works. Initially I thought it would be good to change our model entity numbering to 10^d + entNumberD, where d is the dimension of the entity and entNumberD counts the entities of dimension d, but since we moved to higher numbers as we numbered each lower-dimensional entity type, what we already have is sufficient with a min instead of a max.

The only thing this does not cover is the v v -> e case, which we already discussed is not relevant to this code.
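A sketch of that encapsulation using APF's downward adjacency; the function name is illustrative, and it assumes the per-vertex classification tag ranges listed earlier (region 0 < faces 1-6 < edges 11-22 < vertices 31-38):

#include <climits>
#include <apfMesh2.h>

// Minimum vertex classification tag over the closure of ent; because higher
// dimensional model entities carry lower tags, the minimum selects the
// highest dimensional classification.
int closureMinClass(apf::Mesh2* mesh, apf::MeshTag* vtxClass, apf::MeshEntity* ent) {
  apf::Downward verts;
  int nv = mesh->getDownward(ent, 0, verts);
  int minC = INT_MAX;
  for (int i = 0; i < nv; ++i) {
    int c;
    mesh->getIntTag(verts[i], vtxClass, &c);
    if (c < minC) minC = c;
  }
  return minC;
}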

rickybalin commented 5 years ago

Task list for classification:
1) Set up model adjacency (what model faces close the model region, what model edges close each model face, ...). Do we have to do this? We might have to if we are going to use inheritance of boundary conditions.
2) For each entity in the mesh, set model classification:
a) every mesh region classified on a model region
b) every mesh face classified on a model face or model region
c) ...

Ken and I made some progress in /users/rbalin/matchedReader/core/test/matchedNodeElmReader.cc. We have a version that compiles, but crashes during the call to apf::setCoords(mesh, m.coords, m.localNumVerts, outMap), before any call to classification related code. It does not seem to be something we broke adding our modifications for classification, but I am still in the process of understanding why it happens.

Ken, I had to modify our initial work because of how setModelEntity() is defined, but it should do the same thing.

cwsmith commented 5 years ago

Can you confirm that commit https://github.com/cwsmith/core/commit/75db2593d0d495b96ca0b3148c7d98aa01c75295 builds and runs? The code I added for reading classification etc. may be the cause of the setCoords crash you see.

rickybalin commented 5 years ago

It builds, but does not run. Stops at the same point in apf::setCoords, line 224.

cwsmith commented 5 years ago

OK. Thank you for checking. I should be able to debug this for about an hour or so tomorrow.

rickybalin commented 5 years ago

Thank you, please let us know if there is more we can do to help.

KennethEJansen commented 5 years ago

What broke that version was that, when you were testing 2D, you added the reading of three lines at the top of the .cnn file (1 1 nsd numel nvtxPerElement).

Adding these three lines to our .cnn file gets https://github.com/cwsmith/core/commit/75db2593d0d495b96ca0b3148c7d98aa01c75295 to run through (I have not checked the result).

cwsmith commented 5 years ago

Ahh yes. Thank you. I added that to deal with 2d meshes and multiple topologies.

rickybalin commented 5 years ago

Is there a way to get the coordinates of a MeshEntity when that entity is a mesh vertex? And is there a way to get the coordinates of Adjacent vertices that come out of getAdjacent when it is called on a mesh face or edge (i.e. apf::MeshEntity* e; apf::Adjacent verts; mesh->getAdjacent(e,0,verts);)?

I tried looking in https://scorec.rpi.edu/~seol/scorec/doxygen/apfMesh_8h.html but couldn't see something useful.

Thanks

hapfang commented 5 years ago

Not sure if it is directly applicable in your case, but here is an example to access the vertex coordinates: https://github.com/PHASTA/core/blob/3d5071a1af31c65a0c81b27af00e955472b11050/phasta/phAdapt.cc#L158-L170
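For reference, a minimal sketch of that kind of access (illustrative, not the linked PHASTA code): apf::Mesh::getPoint gives a vertex's coordinates, and getAdjacent plus getPoint handles the vertices bounding an edge or face.

#include <apfMesh2.h>
#include <cstdio>

void printBoundingVtxCoords(apf::Mesh2* mesh, apf::MeshEntity* ent) {
  apf::Adjacent verts;
  mesh->getAdjacent(ent, 0, verts);   // vertices bounding the edge or face
  for (size_t i = 0; i < verts.getSize(); ++i) {
    apf::Vector3 p;
    mesh->getPoint(verts[i], 0, p);   // coordinates of one vertex
    printf("%f %f %f\n", p[0], p[1], p[2]);
  }
}

For a single mesh vertex, mesh->getPoint(vtx, 0, p) can be called on it directly.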

rickybalin commented 5 years ago

Thanks Jun, that helped!

KennethEJansen commented 5 years ago

Three points:
1) Above, I have fixed mistakes about how the model classification of the set of vertices for a given mesh entity controls what model entity that mesh entity should be classified on. I had it backwards the first time I wrote it, when I said it would be the max(entity_closure_vertex_classification), because I forgot we numbered the highest dimensional entities first. I think I have replaced all the times I said "max" with "min". The code that Riccardo and I wrote yesterday uses this min characteristic.

2) Above we made a task list to get classification working. In the first task we left as a question whether it is necessary to actually create all the model entities (e.g., 6 model faces, 12 model edges, and 8 model vertices) and add them to the model, which, as I understand it, only has the notion of a model region at the start. Further, do we need to also define the topology of that model? As noted above, I think we will need this if we expect inheritance of BCs to work (e.g., typically we set BCs only on model faces but expect entities classified on a model edge to inherit or boolean the two model face BCs onto the model edge; or, more specifically, to know what BC to place on mesh entities classified on model edges). Assuming the answer is yes, Cameron, can you point us to any example code that adds model faces, edges, and vertices to a gmi model? Further, can you point to examples that define the model topological relations between the model entities (e.g., all the model faces that create the shell of a model region, the model edges that form a loop around a model face, and finally the model vertices that define the start and end of a model edge)? If that is HARD, could we get away with not using inheritance and listing every BC on every model entity?

3) Is it really necessary for us to classify mesh edges and mesh regions? Put another way, what happens if we don't, under the assumption that we are never going to probe that classification? The only time we would probe edges is if we were using the hierarchical basis, which puts DOFs on mesh edges. AFAIK, our ICs are only on vertices and don't even use classification at all. Essential BCs are assigned to mesh vertices only in this case. Natural boundary conditions are probably built up from iterators over mesh faces and do use their classification on model faces. Region iterators do exist to find connectivity, but this does not involve classification.

To be clear, what Riccardo and I have written thus far just does the following: 1) find the minimum vertex classification for the verts of an entity by pulling them out of getIntTag(v, vtxClass, &c); 2) based on the minimum value of "c" across the vertex list AND the value of c, we know what dimension that entity should be classified on, e.g.,

if (c == INTERIORTAG) {
  mesh->setModelEntity(v,getMdlRgn(model));
} else if (c >= FACE && c <= FACE_LAST) {
  mesh->setModelEntity(v,getMdlFace(mesh,c));
} else if (c >= EDGE && c <= EDGE_LAST) {
  mesh->setModelEntity(v,getMdlEdge(mesh,c));
} else if (c >= VERTEX && c <= VERTEX_LAST) {
  mesh->setModelEntity(v,getMdlVtx(mesh,c));
}

BUT I am unclear whether this does the right thing unless those model faces, edges, and vertices are not only created but also given model adjacency/topology prior to this classification (or after???).

cwsmith commented 5 years ago

@KennethEJansen

2a. No geometric model entities exist prior to the assignment of classification. AFAIK, you can get through mesh verification with all mesh entities classified on a single region. Likewise, I think the number of geometric model entities classified against is up to the user.

2b. I'll look into this. I have not created the topology of a geometric model before. I don't see how inheritance would work without the geometric model topology defined.

3. Yes; all mesh entities need to have classification set.
cwsmith commented 5 years ago

This looks like it could be helpful:

https://github.com/SCOREC/core/blob/7ba0ce41fee3369b8fa8f0233799552ab721d5d4/mds/apfBox.h

https://github.com/SCOREC/core/blob/master/mds/apfBox.cc

the user facing function

https://github.com/SCOREC/core/blob/7ba0ce41fee3369b8fa8f0233799552ab721d5d4/mds/apfBox.h#L64-L74

wants to create the mesh and the box model (with topology). We could probably create an API that just gives us the model.

KennethEJansen commented 5 years ago

I have spent some time looking over apfBox.h and apfBox.cc. I couldn't get a beachhead of understanding directly, so I compiled core/test/box.cc, which uses them, and then walked through the debugger to get a better idea of how the information is "built up". I THINK that we can manually fill the struct modelTable. It seems to be a master list that, for each entity, gives its dim (dimension) and tag. I am fairly sure it "walks" the model as if it were a lattice. If we call whatever coordinate we number first "x", the "entity" at (0,0,0) (the first entity, entity 0) is model vertex 0 (dim=0, tag=0); walking half of L_x leaves us on a model edge, so model entity 1 is a model edge (dim=1, tag=0... 0 because it is the first edge). Continuing to walk in "x" we land at (L_x,0,0), which is entity 2, a model vertex (dim=0, tag=1... the second model vertex).

Next we increment j by 1 and bring x back to zero, so (0, L_y/2, 0) is entity 3 (dim=1, tag=1... the second model edge). Holding y fixed and moving in x brings us to (L_x/2, L_y/2, 0), which is entity 4, model face 0 (dim=2, tag=0, as this is the first face). More succinctly:
(L_x, L_y/2, 0)        entity 5   (dim=1, tag=2)
(0, L_y, 0)            entity 6   (dim=0, tag=2)
(L_x/2, L_y, 0)        entity 7   (dim=1, tag=3)
(L_x, L_y, 0)          entity 8   (dim=0, tag=3)
That covers the first "face" of the box, so we finally increment in the third counting direction:
(0, 0, L_z/2)          entity 9   (dim=1, tag=4)
(L_x/2, 0, L_z/2)      entity 10  (dim=2, tag=1)  (second face, on the bottom)
(L_x, 0, L_z/2)        entity 11  (dim=1, tag=5)
(0, L_y/2, L_z/2)      entity 12  (dim=2, tag=2)
(L_x/2, L_y/2, L_z/2)  entity 13  (dim=3, tag=0)  (the model region, i.e., the "center" of the lattice)
(L_x, L_y/2, L_z/2)    entity 14  (dim=2, tag=3)
(0, L_y, L_z/2)        entity 15  (dim=1, tag=6)
(L_x/2, L_y, L_z/2)    entity 16  (dim=2, tag=4)
(L_x, L_y, L_z/2)      entity 17  (dim=1, tag=7)

There are 9 more entities at L_z (4 model vertices, 4 model edges and a model face) but I imagine the pattern is clear.

I THINK that if we modify this code and put our "tags" in the right places, we should be able to use the rest of the model building part of the code as is, because apfBox.cc calls formModelTable to build the modelTable data structure and then calls buildModel, which uses this (only?) data structure to build the model and, I assume, create the adjacencies. It is unclear to me if the order really matters because I cannot see where that is used, but I would play it safe and keep them in this lattice order.
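A sketch of the lattice rule as I read it (illustrative only): with even lattice indices sitting on model vertices and odd indices mid-span, the dimension of the model entity at lattice point (i, j, k) is just the number of odd indices.

// 0 = model vertex, 1 = model edge, 2 = model face, 3 = model region
int latticeDim(int i, int j, int k) {
  return (i % 2) + (j % 2) + (k % 2);
}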

Actually, I suppose we could use formModelTable with one line of code changed. Line 88 changes to

'int nd[4] = {31,11,1,0};'

that is, start numbering vertices from 31 instead of 0, edges from 11, and faces from 1 (this may force you to change your mesh classification list to follow this lattice numbering convention).

Cameron, can you comment on where we should do this model building operation in your code? I ask because I think you said we already have a model region and I am thinking we probably want to form this table (with our tags) and make a hacked version of this code to use that table to do what we see in buildModel (which is called from line 80 ) BEFORE or IN PLACE of what you are currently doing to create a model, right?

Looking at your code I assume that is in deriveMdsModel. Is this compatible?

cwsmith commented 5 years ago

If we could get this to produce a box model (possibly via a modified buildModel() ) then we want to use that call instead of creating a null model here:

https://github.com/cwsmith/core/blob/d0132907a9bd96263167551c39285c720c97b25c/test/matchedNodeElmReader.cc#L346

KennethEJansen commented 5 years ago

I am confident that we can do that. However, do we then need to comment out apf::deriveMdsModel(mesh)?

From what I read in that function (or actually mds_derive_model that it redirects to), it looks like this function expects a null model.

I am also a little fuzzy about the difference between a tag and a model id. At the top of the code we have

22 /* tags on vertices */
23 #define INTERIORTAG 0
24 #define FACE 1
25 #define FACE_LAST 6
26 #define EDGE 11
27 #define EDGE_LAST 22
28 #define VERTEX 31
29 #define VERTEX_LAST 38
30
31 /* model entity ids */
32 #define INTERIOR_REGION 1

I feel like INTERIOR_REGION should be 0 because later we say

34 apf::ModelEntity* getMdlRgn(gmi_model* model) {
35   apf::ModelEntity* rgn = reinterpret_cast<apf::ModelEntity*>(
36     gmi_find(model, 3, INTERIOR_REGION));
37   PCU_ALWAYS_ASSERT(rgn);
38   return rgn;
39 }
40
41 apf::ModelEntity* getMdlFace(apf::Mesh2* mesh, int tag) {
42   apf::ModelEntity* face = mesh->findModelEntity(2,tag);
43   PCU_ALWAYS_ASSERT(face);
44   return face;
45 }

while the other entity getters are passed a tag and use that tag for findModelEntity.

cwsmith commented 5 years ago

Yes, we need to remove the deriveMdsModel(...) call if we are using buildModel(...).

Suppose we have created a model with buildModel(...) and we know the integers associated with each geometric model entity in that model (via the changes to formModelTable(...), IIUC). Then, knowing the map of the integers that have been attached via an apf::MeshTag to the mesh vertices (the vtxClass tag), we can call mdlEnt = gmi_find(..., <geometric model id>) and mesh->setModelEntity(meshEnt, mdlEnt) to set classification.

The buildModel(...) will dictate what the id of the only geometric model region is. I suspect it sets it to zero.
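If it matters, a quick hedged way to check which region tag buildModel(...) actually assigned is to iterate the 3D model entities with the gmi API (the function name here is illustrative):

#include <gmi.h>
#include <cstdio>

void printRegionTags(gmi_model* model) {
  gmi_iter* it = gmi_begin(model, 3);   // iterator over the 3D (region) entities
  gmi_ent* e;
  while ((e = gmi_next(model, it)))
    printf("model region tag %d\n", gmi_tag(model, e));
  gmi_end(model, it);
}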

KennethEJansen commented 5 years ago

A few more questions: 1) As noted above, I am confident that the code chunk (lines 68 to 84) from apfBox.cc will do what I want. BUT 2) these routines use a lot of helper functions in apfBox.cc (and have structs defined in apfBox.h) SO 3) I am wondering the best way to do this. Is it enough to include apfBox.h as header in your code and then insert these 4 lines of code (line numbers from apfBox.cc)

439 // gmi_model* model = gmi_load(".null");
440 int nx=3;
441 int ny=3;
442 int nz=3;
443 mgrid(nx ? 3 : 1, ny ? 3 : 1, nz ? 3 : 1);
444 formModelTable();
445 gmi_model* model = buildModel();

in place of your current null model call? 4) I guess I am asking if including the header is enough to make the functions and data structs from apfBox.cc "available" to the code we are developing? I went ahead and tried it and of course it failed.

/projects/tools/SCOREC-core/core/test/matchedNodeElmReader.cc:443:43: error: 'mgrid' was not declared in this scope
 mgrid(nx ? 3 : 1, ny ? 3 : 1, nz ? 3 : 1);
                                          ^
/projects/tools/SCOREC-core/core/test/matchedNodeElmReader.cc:444:18: error: 'formModelTable' was not declared in this scope
 formModelTable();
                ^
/projects/tools/SCOREC-core/core/test/matchedNodeElmReader.cc:445:33: error: 'buildModel' was not declared in this scope
 gmi_model* model = buildModel();
                                ^

If you want to take a look at this, it is on the viz nodes and I will leave my VNC open to it:

kjansen@viz003: /projects/tools/SCOREC-core/btest $ vi ../core/test/matchedNodeElmReader.cc
kjansen@viz003: /projects/tools/SCOREC-core/btest

That is the build directory and the path to the source I was trying to modify. Note I also hacked apfBox.cc to use our (non-zero) starting tag counters for each entity type.

cwsmith commented 5 years ago

I think we'll have to make a version of the BoxBuilder constructor:

https://github.com/SCOREC/core/blob/7ba0ce41fee3369b8fa8f0233799552ab721d5d4/mds/apfBox.cc#L68-L84

without these lines:

https://github.com/SCOREC/core/blob/7ba0ce41fee3369b8fa8f0233799552ab721d5d4/mds/apfBox.cc#L81-L83

KennethEJansen commented 5 years ago

Yep that was the first thing I tried. My C++ is too weak though and I hit a wall. The example box.cc calls

69 apf::Mesh2* m = apf::makeMdsBox(nx,ny,nz,wx,wy,wz,is);
70 gmi_model* g = m->getModel();

That is, it gets the model from the mesh, and in our case we are building the mesh from file AFTER the model is created. That is why I was trying to get at the interior functions that don't depend on the mesh.

cwsmith commented 5 years ago

How about running the 'box' test driver to generate a .dmg model of the necessary dimensions, throwing away the mesh that is also produced, and then changing the model entity numbering scheme written by @rickybalin's mesh generation script to match the model entity numbering from apfBox.cc ?

To build the 'box' driver:

make -j box

then run it with:

./test/box

to see the input args.

In matchedNodeElmReader.cc the resulting .dmg would be loaded instead of the 'null' model and the deriveMdsModel(...) would be removed

KennethEJansen commented 5 years ago

We could try this. I have already run box in the debugger to understand how it "worked". A slight mod to what you are saying is to make the one line change to box so that it starts its tag numbers from 31, 11, 1, and 0 so that we actually get the same tag ranges. This is important because we are using these tag ranges to do the classification. I will try this quickly.

Can you also comment on whether there is any difference between a tag number and a model entity id in the code you started and we took over (e.g., lines 22 to 32 ish)?

cwsmith commented 5 years ago

OK. Sounds good.

Can you also comment on whether there is any difference between a tag number and a model entity id in the code you started and we took over (e.g., lines 22 to 32 ish)?

They are just integer ids. If the ints attached to the mesh vertices are exactly the same as the model entity ids then the lines

https://github.com/cwsmith/core/blob/d0132907a9bd96263167551c39285c720c97b25c/test/matchedNodeElmReader.cc#L21-L31

are not needed and the values stored in the mesh entity tag can be passed to the gmi_find(...) call to get the model entity.
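In that case the per-vertex classification could look roughly like the sketch below; dimFromTag is a hypothetical helper mapping the tag ranges to a model dimension, and it assumes the interior tag in the classification file matches the model region id.

#include <gmi.h>
#include <apfMesh2.h>

// hypothetical helper: map a classification tag to a model entity dimension
static int dimFromTag(int c) {
  if (c >= 31) return 0;  // 31-38: model vertices
  if (c >= 11) return 1;  // 11-22: model edges
  if (c >= 1)  return 2;  // 1-6:   model faces
  return 3;               // 0:     interior region
}

static void classifyVtx(gmi_model* model, apf::Mesh2* mesh, apf::MeshEntity* v, int c) {
  apf::ModelEntity* me = reinterpret_cast<apf::ModelEntity*>(
      gmi_find(model, dimFromTag(c), c));
  mesh->setModelEntity(v, me);
}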

KennethEJansen commented 5 years ago

This built and is failing in verify now, so I think that means we still have work to do with classification. Riccardo, I tested this in your VNC:

 2582  mpirun -np 1 /projects/tools/SCOREC-core/btest/test/box 5 5 5 1 2 3 0 modbox.dmg meshbox/
 2583  mpirun -np 2 /projects/tools/SCOREC-core/btest/test/matchedNodeElmReader box_small.cnn box_small.crd NULL box_small.class model mesh/
rbalin@viz003: ~/NSF_DataDrivenTurbMod/PrelimTests/matchedNodeElmReader/box/kejTest

in a subdirectory of where you were testing. The output suggests we have the classification wrong, perhaps simply because the faces and edges dictated by my hack of box don't match your file, e.g.,

numVerts 27
isMatched 0
seconds to create mesh 0.002
APF FAILED: apf::Verify: vertex with 3 adjacent edges
centroid: (0, 0, 0)
based on the following:

[viz003:29720] Process received signal
APF FAILED: apf::Verify: vertex with 3 adjacent edges
centroid: (1, 0, 0)
based on the following:

I think (0,0,0) should be classified on a model vertex, right? Since your .crd says it is a unit box, (0, 1) in each of the three directions.

KennethEJansen commented 5 years ago

I am in the debugger and finding that before I execute line 182 I have what looks like valid information in a totalview dive on model. After I step past line 182 I get "bad address" when I dive on model. I guess that means that something in the getMdlRgn function is corrupting the model?

181 void setRgnClassification(gmi_model* model, apf::Mesh2* mesh) {
182   apf::ModelEntity* mdlRgn = getMdlRgn(model);
183   apf::MeshIterator* it = mesh->begin(3);
184   apf::MeshEntity* rgn;
185   while( (rgn = mesh->iterate(it)) )
186     mesh->setModelEntity(rgn,mdlRgn);
187   mesh->end(it);
188 }
189
190 void setClassification(gmi_model* model, apf::Mesh2* mesh, apf::MeshTag* t) {
191   setRgnClassification(model,mesh);
192   setFaceClassification(model,mesh,t);
193   setEdgeClassification(model,mesh,t);
194   setVtxClassification(model,mesh,t);
195   mesh->acceptChanges();
196 }

Tracking this a bit deeper (I missed that getMdlRgn is our function), the corruption happens immediately on the dive into getMdlRgn. I don't understand well enough what happens in lines 36-37:

35 apf::ModelEntity* getMdlRgn(gmi_model* model) {
36   apf::ModelEntity* rgn = reinterpret_cast<apf::ModelEntity*>(
37     gmi_find(model, 3, INTERIOR_REGION));
38   PCU_ALWAYS_ASSERT(rgn);
39   return rgn;
40 }

but totalview takes a step from line 182 to line 37. Before the step the model is a valid address; after it, it says bad address. Note that when I go back up to line 182 it is bad there too, so it is not mis-passed information; rather, it is what we are doing on lines 36 and 37 that is corrupting the model. I am going to need help understanding what lines 36 and 37 were supposed to do before there is any hope of fixing them.

BELOW Might not be relevant now.

I wonder if we are still missing something significant. I did comment out the call to derive the model, BUT don't we have to do SOMETHING to associate the model with the mesh? In rough terms it seems we are:
1) reading the model
2) reading the coordinates and connectivity
3) using apf::construct to build the mesh in the MDS form
4) setting the coordinates
5) applying classification

Is it step 5) that does the model/mesh association?

cwsmith commented 5 years ago

I assume that the model object that is returned by gmi_load("/path/to/the/dmg/from/box") is being passed to the create empty mesh call?

https://github.com/cwsmith/core/blob/d0132907a9bd96263167551c39285c720c97b25c/test/matchedNodeElmReader.cc#L347

KennethEJansen commented 5 years ago

I probably should have created a new comment since I am not sure updates go out. The issue is lines 36 and 37. Here is what I did for the model load:

443 gmi_model* model = gmi_load("modbox.dmg");

There are no complaints about this in the following code:

452   apf::Mesh2* mesh = apf::makeEmptyMdsMesh(model, m.dim, isMatched);
453   apf::GlobalToVert outMap;
454   apf::construct(mesh, m.elements, m.localNumElms, m.elementType, outMap);
455   delete [] m.elements;
456   apf::alignMdsRemotes(mesh);
457   // apf::deriveMdsModel(mesh);
458   /*for (int i=0; i<81; i++) {
459     std::cout<<m.coords[i]<<std::endl;
460   }*/
461   apf::setCoords(mesh, m.coords, m.localNumVerts, outMap);
462   delete [] m.coords;
463   if( isMatched ) {
464     apf::setMatches(mesh, m.matches, m.localNumVerts, outMap);
465     mesh->acceptChanges();
466     delete [] m.matches;
467   }
468   apf::MeshTag* t = setIntTag(mesh, m.classification, 1,
469       m.localNumVerts, outMap);
470   outMap.clear();
471   setClassification(model,mesh,t);

but within setClassification, the corruption at lines 36 and 37 is what I am concerned about.