visit-dav / visit

VisIt - Visualization and Data Analysis for Mesh-based Scientific Data
https://visit.llnl.gov
BSD 3-Clause "New" or "Revised" License

Diablo Mili plot file crashes when trying to view shared variables #16993

Open durrenberger1 opened 3 years ago

durrenberger1 commented 3 years ago

Jerome Solberg created a plot file using Diablo which has gap analysis for bricks and particles. These become shared variables. When I try to create a pseudocolor plot of the shared variables, the server crashes, sometimes on the initial draw and other times when I move a timestep. I can supply you with the debug logs up to level 3 and the plot files on the RZ if needed. This is the error that occurred.

error

brugger1 commented 3 years ago

Hi Kevin,

What version of VisIt are you using? The default on rzgenie is 3.1.4, but we have 3.2.1 installed, which has some mili improvements over 3.1.4.

If you’re using the latest version, then yes, the files and the level 3 debug logs would be helpful.

Eric

durrenberger1 commented 3 years ago

Hi Eric,

I tried both the default 3.1.4 and version 3.2.1 and got the same result.

I will give you the files on the RZ.

Kevin

brugger1 commented 3 years ago

@durrenberger1, I did some debugging and the crash is happening because, based on the metadata in the mili file, the data read from the file is being written outside the bounds of the array that is supposed to hold it.

Here is my overview of the mesh topology.

There is a strip of 10 hexes on top. Below that is a strip of 10 shells on the lower boundary of the 10 hexes. Below that is a strip of 10 shells. Below that is a strip of 10 hexes on the bottom. The shells line up with the top surfaces of the hexes.

This gives 40 elements total: 10 hexes, 10 shells, 10 shells, 10 hexes.

The variable we are trying to plot is "gap" and it is an element quantity that is only defined on the 2 sets of 10 shells.

It writes the first set of 10 values offset into the array of 40, 10 spots in. This is correct. What I believe should happen is the next set of 10 values should be written into the array of 40, offset 20 spots. What is happening is that it is writing 22 values offset by 155.

Here is the information about "gap" in the mili file:

        "gap": {
            "LongName": "Nodal Gap",
            "num_type": 4,
            "agg_type": 1,
            "vector_size": 3,
            "dims": 0,
            "rank": 1,
            "Center": 1,
            "VTK_TYPE": 2,
            "vector_components": [
                "gap_x",
                "gap_y",
                "gap_z"
            ],
            "subrecords": [
                15,
                16
            ]
        }

Note that it references subrecords 15 and 16. Here is the information that VisIt has read in about subrecords 15 and 16.

AddSubrec: SRId=15,numEl=10,numDB=1
dBRAnges=11,20,
AddSubrec: SRId=15,numEl=10,numDB=1
dBRAnges=11,20,
AddSubrec: SRId=15,numEl=10,numDB=1
dBRAnges=11,20,
AddSubrec: SRId=15,numEl=10,numDB=1
dBRAnges=11,20,
AddSubrec: SRId=15,numEl=10,numDB=1
dBRAnges=11,20,
AddSubrec: SRId=16,numEl=22,numDB=1
dBRAnges=155,176,
AddSubrec: SRId=16,numEl=22,numDB=1
dBRAnges=155,176,
AddSubrec: SRId=16,numEl=22,numDB=1
dBRAnges=155,176,
AddSubrec: SRId=16,numEl=22,numDB=1
dBRAnges=155,176,
AddSubrec: SRId=16,numEl=22,numDB=1
dBRAnges=155,176,

The really strange thing is that subrec 15 has 10 values, implying an element value, while subrec 16 has 22 values, implying a nodal value (10 connected shells give 11 rows of 2 nodes, giving 22 nodes total). So the problem is that the values for gap are zonal for the first 10 shells and nodal for the second 10 shells, which seems like it was unintended and won't work.

Subrec 16 should have 10 elements and a range of 21 - 30, assuming gap is an element value. I noticed that the long name is "Nodal Gap", so maybe the intent was for it to be a nodal variable. I'm not sure I know what the correct values would be in this case, but subrec 15 would definitely be wrong and subrec 16 might still have the incorrect range.
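For reference, the range check described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not code from the Mili plugin: it parses the debug-log lines quoted earlier (duplicates removed), treats the ranges as 1-based and inclusive, and assumes each subrecord has a single range (numDB=1). The names `parse_subrecords` and `fits` are made up for this sketch.

```python
import re

# Debug-log lines copied from the comment above (duplicates removed).
LOG = """\
AddSubrec: SRId=15,numEl=10,numDB=1
dBRAnges=11,20,
AddSubrec: SRId=16,numEl=22,numDB=1
dBRAnges=155,176,
"""

NUM_ELEMENTS = 40  # 10 hexes + 10 shells + 10 shells + 10 hexes


def parse_subrecords(log_text):
    """Return {subrecord id: (numEl, first, last)}; assumes one range per subrecord."""
    pattern = re.compile(
        r"AddSubrec: SRId=(\d+),numEl=(\d+),numDB=\d+\s+dBRAnges=(\d+),(\d+),"
    )
    return {
        int(sr): (int(n), int(first), int(last))
        for sr, n, first, last in pattern.findall(log_text)
    }


def fits(first, last, total):
    """True when the 1-based inclusive range [first, last] lies within 1..total."""
    return 1 <= first <= last <= total


subrecs = parse_subrecords(LOG)
for sr_id, (num_el, first, last) in sorted(subrecs.items()):
    status = "ok" if fits(first, last, NUM_ELEMENTS) else "out of range"
    print(f"subrec {sr_id}: {num_el} values, range {first}-{last}: {status}")

# Node count for a strip of 10 connected shells: 11 rows of 2 nodes.
nodes_in_strip = (10 + 1) * 2
```

Under these assumptions subrec 15 passes the check and subrec 16 does not, while the range 21 - 30 suggested above would pass.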

Thoughts on my analysis?

markcmiller86 commented 3 years ago

Analysis seems great. I assume by this...

It writes the first set of 10 values offset into the array of 40, 10 spots in. This is correct.

Do you mean that the Mili plugin is writing these values to a VTK data array to be returned from the plugin to VisIt in a GetVar call?

It could be helpful if you could point to the lines of code you are talking about by going to the source. I assume you are talking about code at or around here, https://github.com/visit-dav/visit/blob/09309bfa35b2c4f14765c17284571645cec0a07a/src/databases/Mili/avtMiliFileFormat.C#L1508-L1528

What does being a "shared variable" mean?

The error message pasted in the original post shows it's plotting a variable named Primal/Shared/gap_magnitude, which suggests it's a vector variable and VisIt is computing the magnitude using expressions. So, is it maybe a 3-component quantity involved?

Can the data file be attached here?

brugger1 commented 3 years ago

@markcmiller86 I am creating a pseudocolor plot of Primal/Shared/gap_magnitude as you mentioned, so it is reading in gap, which is a vector. It is then calculating the magnitude with an expression.

Regarding the location in the code. That spot is a bit higher in the call chain. The actual overwrite is located in the call to ReadMiliResults in this block of code: https://github.com/visit-dav/visit/blob/09309bfa35b2c4f14765c17284571645cec0a07a/src/databases/Mili/avtMiliFileFormat.C#L2279-L2292

The important variable values are:

start = 0
blockRanges[0] = 155
nTargetEl = 22
varSize = 3

I believe they should be:

start = 0
blockRanges[0] = 20
nTargetEl = 10
varSize = 3

The array dataBuffer has length 120, so with blockRanges[0]=155, it writes outside the array bounds.
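To make the arithmetic concrete, here is a rough Python sketch of the bounds check using the values listed above. The indexing formula is an assumption on my part (the block start is treated as a 0-based element offset, with varSize consecutive scalars per element); the actual indexing inside ReadMiliResults may differ. `write_span` and `in_bounds` are hypothetical helper names.

```python
def write_span(block_start, n_target_el, var_size):
    """Half-open span [begin, end) of scalar slots a block would occupy.

    Assumption: block_start is a 0-based element offset and each element
    holds var_size consecutive scalars.
    """
    begin = block_start * var_size
    end = (block_start + n_target_el) * var_size
    return begin, end


def in_bounds(span, buffer_len):
    """True when the half-open span fits inside a buffer of buffer_len scalars."""
    begin, end = span
    return 0 <= begin <= end <= buffer_len


BUFFER_LEN = 120  # length of dataBuffer; consistent with 40 elements * 3 components

observed = write_span(block_start=155, n_target_el=22, var_size=3)
expected = write_span(block_start=20, n_target_el=10, var_size=3)

print("observed", observed, "in bounds:", in_bounds(observed, BUFFER_LEN))
print("expected", expected, "in bounds:", in_bounds(expected, BUFFER_LEN))
```

With the observed values the write would start well past the end of the 120-entry buffer, whereas the values I believe are correct stay inside it.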

durrenberger1 commented 3 years ago

Hi Eric,

So shared variables were added fairly recently to the plugin. If you have elements which have a common variable, say stress for bricks and shells, they would be combined into a Shared menu item. We do this within Griz, as we do not have the ability to render more than one element/variable otherwise. Alister was working on this prior to leaving to do the AI work he is currently doing.

This was to give analysts a familiar structure that they see in Griz.

The additional 'gap' elements for subrecord 16 are actually particles, which resolve to nodal values for VisIt.

brugger1 commented 3 years ago

That additional information helps a lot! Especially that the additional gap elements are particles. If I understood the Mili file format I could probably figure this out myself, but it would be useful to know what the total nodes are for the problem and then all the elements for the problem.

My guess is that there are 10 hexes, 10 shells, 10 shells, 10 hexes, but I don't know how the particles fit in. Based on the block offsets I'm seeing, I assume there are more things going on, but I'm not sure.

Maybe a WebEx with @durrenberger1 , @markcmiller86 and I would be more efficient at this point.

durrenberger1 commented 3 years ago

We can meet. But I need to get with Diablo and see what they are trying to do. There are 2 particles per node and they used the long name Nodal, which in itself won't affect things, but it does not make sense. I need to know what they are trying to accomplish.

Also, at the bottom of the .mili file is a section called Classes which lists all the class info. Here is that section:

        "Classes": {
            "count": 6,
            "node": {
                "LongName": "Nodal",
                "ElementCount": 88,
                "SuperClass": 1,
                "variables": [ "nodpos", "nodacc", "temp", "nodvel", "disp_iter", "noddisp" ]
            },
            "quad": {
                "LongName": "Quads",
                "ElementCount": 20,
                "SuperClass": 5,
                "variables": [ "nodforce", "nodnormpen", "normgap", "gap", "nodpres" ]
            },
            "brick": {
                "LongName": "Bricks",
                "ElementCount": 20,
                "SuperClass": 9,
                "variables": [ "stress", "eeff" ]
            },
            "mat": {
                "LongName": "Material",
                "ElementCount": 0,
                "SuperClass": 10
            },
            "glob": {
                "LongName": "Global",
                "ElementCount": 0,
                "SuperClass": 11
            },
            "particle": {
                "LongName": "Nodal",
                "ElementCount": 176,
                "SuperClass": 13,
                "variables": [ "nodforce", "dx", "dy", "dz", "nodnormpen", "normgap", "gap", "nodpres" ]
            }
        }
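As a quick cross-check, the counts in that section can be tallied with a short Python sketch. The dictionary below is hand-copied from the Classes section (only ElementCount and variables are kept, and the empty variable lists for mat and glob are my placeholder since those classes list none). Nothing here reads an actual .mili file, and `classes_with` is a made-up helper.

```python
# Hand-copied from the Classes section above; not read from a real file.
classes = {
    "node": {"ElementCount": 88,
             "variables": ["nodpos", "nodacc", "temp", "nodvel", "disp_iter", "noddisp"]},
    "quad": {"ElementCount": 20,
             "variables": ["nodforce", "nodnormpen", "normgap", "gap", "nodpres"]},
    "brick": {"ElementCount": 20, "variables": ["stress", "eeff"]},
    "mat": {"ElementCount": 0, "variables": []},   # no variables listed in the file
    "glob": {"ElementCount": 0, "variables": []},  # no variables listed in the file
    "particle": {"ElementCount": 176,
                 "variables": ["nodforce", "dx", "dy", "dz", "nodnormpen",
                               "normgap", "gap", "nodpres"]},
}


def classes_with(variable):
    """Names of the classes that list the given variable."""
    return sorted(name for name, info in classes.items()
                  if variable in info["variables"])


gap_classes = classes_with("gap")
shared = sorted(set(classes["quad"]["variables"])
                & set(classes["particle"]["variables"]))
mesh_elements = classes["quad"]["ElementCount"] + classes["brick"]["ElementCount"]
particles_per_node = classes["particle"]["ElementCount"] / classes["node"]["ElementCount"]

print("classes with gap:", gap_classes)
print("variables common to quad and particle:", shared)
print("quads + bricks:", mesh_elements)
print("particles per node:", particles_per_node)
```

The tally gives 40 quads plus bricks and 2 particles per node, which matches what was said earlier in the thread. The 155-176 range seen for subrecord 16 also falls inside the particle count of 176, which may mean it is numbered in particle space, though nothing in the file excerpt confirms that.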

markcmiller86 commented 3 years ago

Maybe a WebEx with @durrenberger1 , @markcmiller86 and I would be more efficient at this point.

I can make myself available for that.

durrenberger1 commented 3 years ago

Let me work with Diablo first. There are some things not correct with the output of the plot file.

Kevin


durrenberger1 commented 3 years ago

Diablo is working on some changes to their output that I noted to them were incorrect. However, we will need to make a decision on Shared variables. Right now, I am thinking of removing them from VisIt since you are able to view multiple variables in VisIt. The main reason we have them in Griz is that they could only visualize one element/variable combination. If they had hexes and shells both with stress, the analyst had to pick one or the other to view without the Shared category.

But let Diablo finish fixing their output first.

markcmiller86 commented 3 years ago

Right now, I am thinking of removing them from VisIt since you are able to view multiple variables in VisIt. The main reason we have them in Griz is that they could only visualize one element/variable combination. If they had hexes and shells both with stress, the analyst had to pick one or the other to view without the Shared category.

IMHO, unless I am not understanding the use case w.r.t. shared variables, I don't see any reasons VisIt shouldn't be able to handle these mostly naturally.

durrenberger1 commented 3 years ago

I think the issue will be with boundary conditions in Diablo. They will have particles and quads, which they use for boundary conditions, and both have the same variable. The particles currently resolve to nodes and quads of course are elements, but they share common variables. My understanding is that the mix of the nodal particles and quad elements would cause issues with the variable data request. However, I could be wrong. Most likely I am wrong.

markcmiller86 commented 3 years ago

Maybe the issue is that VisIt thinks of point-like mesh objects as nodes when it should really be either nodes or elements. In the case of a mesh consisting solely of points (e.g. a point mesh), the nodes are synonymous with the elements. They are one and the same.

I suppose that could also mean we can have both node- and zone-centered variables on a point-mesh in VisIt. We may currently restrict that to only node-centered.

So, given a (shared) variable, foo, defined on the nodes of quads+particles and a (shared) variable, bar, defined on the elements of quads+particles, foo should think of itself as node-centered on particles and bar should think of itself as zone-centered on particles.

There may be a current (and unnecessary) restriction in VisIt which prevents that. If so, we should relax it.

brugger1 commented 3 years ago

VisIt can handle unstructured meshes with all types of elements including point elements, so you can have a single mesh with hexes, quads, lines and points. If the variables were all zone-centered, then a point element could have a different value at the points from the shells that share nodes with the points.
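A toy Python sketch of that last point, assuming a made-up mesh of one quad with a point element sitting on two of its nodes. The connectivity and the values are invented purely for illustration and have nothing to do with the Diablo file or with VisIt's internal data structures.

```python
# Hypothetical toy mesh: one quad plus a point element on two of its nodes.
cells = [
    ("quad", (0, 1, 2, 3)),
    ("point", (0,)),
    ("point", (1,)),
]

# Made-up zone-centered values, one per cell (not taken from any real file).
zonal_gap = [0.5, 0.1, 0.2]


def values_at_node(node_id):
    """Collect (cell type, zonal value) for every cell that touches node_id."""
    return [
        (kind, zonal_gap[i])
        for i, (kind, conn) in enumerate(cells)
        if node_id in conn
    ]


print(values_at_node(0))
```

Node 0 is touched by both the quad and a point element, and because the variable is zone-centered each of those cells carries its own value there.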

durrenberger1 commented 3 years ago

Alright, this makes sense. When Diablo is done fixing their output we can revisit.

Kevin
