visit-dav / visit

VisIt - Visualization and Data Analysis for Mesh-based Scientific Data
https://visit.llnl.gov
BSD 3-Clause "New" or "Revised" License
428 stars 111 forks source link

Variable sum vs Mesh volume #4156

Closed rusu24edward closed 4 years ago

rusu24edward commented 4 years ago

Describe the bug

This stems from #4063. I was able to duplicate the user's bug with our own data. See below for some analysis:

To Reproduce

  1. Checkout my branch origin/bugfix/rusu1/visit-4063-variable-volume-wrong.
  2. Launch visit with ucd3d.silo
  3. Recreate the following tables: Screen Shot 2019-12-05 at 1 34 55 PM

The data in "from Verdict" table is from the terminal, which comes from vtkVerdictExpression.

In this dataset, it looks like using the mesh volume is correct and that variable summation is wrong (because of the sumFromOriginalElement flag).

Is there an example where using sumFromOriginalElement produces the correct result?

markcmiller86 commented 4 years ago

Is there an example where using sumFromOriginalElement produces the correct result?

Can you maybe point to the body of code where the logic involving this variable is triggering the behavior?

markcmiller86 commented 4 years ago

Oops. Sorry, I guess I can go find it on your branch.

rusu24edward commented 4 years ago

My branch just has some debugging outputs. The variable gets activated here and is involved in the summation process here

biagas commented 4 years ago

My commit adding the 'sumFromOriginalElement' is from 2006. I looked up the commit log and found this explanation:

This fixes '6968, Variable sum giving wrong answer --
due to a couple of problems:

1)  When zones split by Slicing, Clipping, MIR, etc,
    the resultant sub-cells were each contributing to the
    sum -- bad because when split, the zonal value is passed
    intact to to the resultant cells, so you may have multiple
    cells contributing value v, where it should be only 1.

    I made the summation query utilize OriginalCell information
    to ensure each 'original' cell contributes only 1 value
    to the sum.

2)  Under same conditions as above, for nodal data, nodes may
    have been duplicated, or left in the dataset even though
    no longer a part of the resulting mesh.  I removed
    non-relevant points, and ensure that if duplicate points
    had been created (from same original node), that they
    contribute only once to the sum.

I added a flag to avtDataValidity, so that the Variable Sum query
could know when zones had been split (not simply 'invalidated',
like with OnionPeel and Threshold).

The code using this flag has changed since the original addition, changing the conditions under which the flag is set to true. I don't know who added the new code or the reasoning behind it. Since the 'bDoCustomFiltering' flag that helps determine when sumFromOriginalElement is set to true is based off avtDataValidity settings, it is quite possible that the instances where these flags are set have changed considerably as well.

markcmiller86 commented 4 years ago

Ok, so something is a bit odd about the dataset used here. Material 9 consists of 1200 hexahedra. However, only 400 of those have non-zero volume. If I select material 9 and do a variable sum of volume, I get zero, which is wrong. If I first apply threshold on volume with min of 0.001, then only 400 zones result and variable sum of those 400, non-zero, volumes does yield the correct volume for material 9 whereas without the threshold operator removing the 800 degenerate hexahedra which seem to define the inner and outer layers of material 9, I get a value of zero. I wonder if those degenerate hexes are causing negative volumes to get counted?

markcmiller86 commented 4 years ago

I wonder if those degenerate hexes are causing negative volumes to get counted?

No, they aren't. Those degenerate hexahedra are all zero volume (or very tiny positive). None are negative volume.

biagas commented 4 years ago

I tried finding information on the old bug '6968 to figure out what scenario prompted the use of 'sumFromOriginalElement', but the redmine stuff is gone. Based on my comment, I assume variable sum queries of data that had been clipped/sliced was erroneous. Some filters changed at the same time to report that they were 'splitting cells': Contour, CracksClipper, Cone, cylinder, Isovolume, SphereSlice, ThreeSlice, Hope this information helps.

markcmiller86 commented 4 years ago

So, I am seeing potentially multiple issues here; the original one that triggered this investigation and then others as a result of investigating the original issue.

A program logical variable name like sumFromOriginalElement seems like it should be impacted by whether the associated mesh quantity is an intensive of extensive quantity. If it is intensive variable, its value does not change when the material is broken (e.g. think original mesh cells being split) into smaller and smaller pieces. Density is a good example. If MIR decomposes a cell into several smaller pieces, each piece gets the same (original cell's) density.

Now, for a quantity like volume, its value does indeed change as we break cells into smaller pieces and so summing the original pieces for a volume result does make sense to me.

markcmiller86 commented 4 years ago

If MIR decomposes a cell into several smaller pieces, each piece gets the same (original cell's) density.

In truth, MIR is a special case because the original cell, if it is mixing is NOT homogenous in material. The one scalar density we know for the cell (typically) is an average of the densities of the mixing materials and, yes, when MIR splits it, each will get a different density (the density variable's mixed values) when it is split.

markcmiller86 commented 4 years ago

Attached here find a histogram plot of mesh_quality/Volume from ucd3d.silo. The red, green and blue plots are for each of the materials, 1, 4 and 9, selected individually, respectively. The brown plot is when no materials are selected.

Untitled

There are a few issues with this plot. One odd thing is all the zero volumes that result when material 9 (blue) is the only one selected. MIR appears to be creating layers of degenerate hexahedra.

There is mixing going on in this dataset. However, the mixing is only between materials 1 and 4 and does not include 9. You can plainly see this in the image below.

Untitled 2
biagas commented 4 years ago

Perhaps an explanation of the use of original cells in the summation is in order (when sumFromOriginalElment is true):

For each cell processed, if it's Original cell value has already been used in the summation, then it is skipped. So in the case of 1 cell being split into 4 pieces, only 1 of the pieces will contribute to the summation. This is definitely wrong if we are looking at volume, but correct for density.

So, we need yet another mechanism for determining if a the variable being summed should use the sumFromOriginalElement flag or not.

rusu24edward commented 4 years ago

Thank you @markcmiller86 and @biagas for your input here. I want to summarize what I've read to make sure I understand correctly:

When a zone is split, the variables in the original zone needs to be correctly applied to the new zones. If the variable is intrinsic/intensive, then the correct thing to do is to just take the variable exactly. For example, the density of the original zone is the same as the density of the clipped zones. If the variable is extrinsic/extensive, then the correct thing to do is to weigh the value of the variable by the volume of the zones. For example, the mass of the original zone should be normalized to the density by the volume of the original zone and then renormalized to the mass by the volume of the new zones. This seems to be the general idea, with some special treatment if the zone is mixed material.

Visit creates a new mesh variable called mesh_quality/volume. When we go to sum this variable, Visit doesn't know that it's a volume and should just be directly summed. In order to be general, it has to ask if the variable should be summed from the original value, and because of the MIR, it triggers true.

It then seems like there is some issue with the mapping back to the original element or the logic that uses it. If the mapping worked correctly, then it seems to me that the calculated volume would be greater than the correct volume because the volume of zoo elements from a split are <= the volume of the original element. However, we are seeing that the calculated volume is actually less than the correct volume, indicating an issue with the mapping or the logic of it.

rusu24edward commented 4 years ago

Note to self: can add intrinsic/extrinsic switch as a windowtype in QvisQueryWindow::UpdateArgumentPanel().

rusu24edward commented 4 years ago

It then seems like there is some issue with the mapping back to the original element or the logic that uses it. If the mapping worked correctly, then it seems to me that the calculated volume would be greater than the correct volume because the volume of zoo elements from a split are <= the volume of the original element. However, we are seeing that the calculated volume is actually less than the correct volume, indicating an issue with the mapping or the logic of it.

avtSummartionQuery::Execute looks at the value in the variable, not at the value of the original variable. Here's the flow using mesh_quality/volume with zones:

for every zone in the dataset:
    value = mesh_quality/volume[zone] # Note, NOT the original volume
    original_zone = the_original_zone_from[zone]
    add original_zone to a set
    if the original_zone is already in the set:
        then ignore the value
    else:
        add the value

So there are a lot of volume values being left out because of the original_zone logic, and since the value is not the volume from the original zone, it is less than the true value.

One possible fix is to try to get the volume of the original zone, although I don't know at all how easy/difficult this would be.

markcmiller86 commented 4 years ago

Note to self: can add intrinsic/extrinsic switch as a windowtype in QvisQueryWindow::UpdateArgumentPanel().

I think that might be useful in cases where the database variables themselves don't already report their intrinsic-ness or extrinsic-ness. However, in Silo anyways, data producers can already indicate whether a variable is extrinsic or intrinsic and this should simply be captured in metadata (maybe avtBaseVarMetaData.[Ch] and then used to control logic such as this.

At the same time, when the database doesn't indicate it, there should be a means for a user to tell VisIt how to treat a variable.

rusu24edward commented 4 years ago

Another thought: when does it make sense to sum an intrinsic variable? It makes sense to me to average it, but what about just a raw sum?

markcmiller86 commented 4 years ago

I am not sure I would put a lot of weight into the difference between a sum and an avarage. They differ only by a constant factor. And, is your question specific to the case of handling the parts of some larger mesh element that was decomposed (somehow, either by MIR or arbitrary polyedral decomp into zoo elements)?

rusu24edward commented 4 years ago

I'm not so much asking about the numerical difference between a sum and an average. I'm thinking about it from "what a user would reasonably do." When is it reasonable to take the sum of an intrinsic variable? What physical meaning does that have? For example, what does the sum of all the densities physically mean?

rusu24edward commented 4 years ago

As per the discussion today (12/19/19), this ticket is a good starting point for the bigger picture outlined in #4202. mesh_quality/volume is intuitively extrinsic, so we can capture that information in the meta data of a variable. The variable sum can look at that data to determine who to do the summation.

rusu24edward commented 4 years ago

Note to self: plan of attack: add volume-dependency variable to avtExpressionDataTreeIterator. avtVerdictExpression inherits from this and can modify that variable. Then save the meta data into vtkDataSet or avtDataRepresentation at the end of avtExpressionDataTreeIterator::ExecuteData. Then grab this meta variable from the dataset in avtSummationQuery and use that determine how to sum. Still need to investigate exactly how to add the meta data.

rusu24edward commented 4 years ago