yt-project / yt

Main yt repository
http://yt-project.org

Incorrect pixelization for some FLASH data #1361

Closed ngoldbaum closed 7 years ago

ngoldbaum commented 7 years ago

Bug summary

See http://lists.spacepope.org/pipermail/yt-users-spacepope.org/2017-May/008602.html and replies.

For some reason the pixelization picks up "streaky", incorrect values along some grid boundaries.

Code for reproduction

import yt

ds = yt.load('lasslab_hdf5_plt_cnt_0000.hdf5')
plot = yt.SlicePlot(ds, 'z', 'density')
plot.set_log('density', False)  # use a linear colorbar
plot.save('nogrids')
plot.annotate_grids()  # overlay the grid boundaries on the slice
plot.save('grids')

The dataset is available at http://use.yt/upload/364c7039

Actual outcome

The script produces these images:

[Images: grids_slice_z_density, nogrids_slice_z_density]

Expected outcome

The horizontal streaks at y = -170, 0, 170, 400, and 570 are artifacts. When we look at the raw field values, none of them are anywhere near ~0.4, the value in the streaks:

In [2]: np.unique(ad['gas', 'density'])
Out[2]:
YTArray([  9.99999975e-06,   1.75001249e-01,   2.00000003e-01,
         1.04400003e+00,   1.29999995e+00]) g/cm**3

So there are values near zero, near 0.2, and near 1, but none near 0.4. (Suspiciously, 0.4 is about twice 0.2, which already hints at double-counting somewhere.)

Version Information

ngoldbaum commented 7 years ago

On a suggestion from Matt, I turned antialias off by hand:

diff --git a/yt/geometry/coordinates/cartesian_coordinates.py b/yt/geometry/coordinates/cartesian_coordinates.py
index 479dac4..b0e2ab1 100644
--- a/yt/geometry/coordinates/cartesian_coordinates.py
+++ b/yt/geometry/coordinates/cartesian_coordinates.py
@@ -141,6 +141,7 @@ class CartesianCoordinateHandler(CoordinateHandler):
                                      nodal_data, coord, bounds, int(antialias),
                                      period, int(periodic))
         else:
+            antialias=0
             pixelize_cartesian(buff, data_source['px'], data_source['py'],
                                data_source['pdx'], data_source['pdy'],
                                data_source[field],

Which seems to fix the issue:

[Image: nogrids_slice_z_density]

So there seems to be a corner case in the handling of antialias in pixelize_cartesian.
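
For context, here is the relevant deposit logic paraphrased in Python (a sketch of the Cython inner loop, not the actual code); buff, dsp, overlap1, and overlap2 follow the names in pixelize_cartesian, but the deposit helper itself is hypothetical:

def deposit(buff, i, j, dsp, overlap1, overlap2, antialias):
    # Sketch of the per-pixel deposit in pixelize_cartesian; dsp is the
    # cell value, overlap1/overlap2 the fractional x/y coverage of
    # pixel (i, j) by the cell.
    if antialias:
        # compositing: only correct if, per pixel, overlaps sum to 1
        buff[i, j] += (dsp * overlap1) * overlap2
    else:
        # replacement: depositing the same region twice just overwrites
        buff[i, j] = dsp

Forcing antialias=0 routes every cell through the replacement branch, so it hides whatever is violating the compositing assumption rather than fixing it.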

cevans216 commented 7 years ago

I was looking into this issue this morning, as it also affects an Einstein Toolkit frontend that I'm working on. It looks like the problem is that some of the cells from the YTSlice overlap, and therefore the line buff[i,j] += (dsp * overlap1) * overlap2 in pixelize_cartesian does not work as intended.

A quick fix is to also keep track of a running 'weight' in pixelize_cartesian that is just given by overlap1*overlap2, and scale the result at the end:

diff --git a/yt/geometry/coordinates/cartesian_coordinates.py b/yt/geometry/coordinates/cartesian_coordinates.py
index 7fdc6b49b..14a7b0eb6 100644
--- a/yt/geometry/coordinates/cartesian_coordinates.py
+++ b/yt/geometry/coordinates/cartesian_coordinates.py
@@ -221,6 +221,7 @@ class CartesianCoordinateHandler(CoordinateHandler):
             period = period.in_units("code_length").d

         buff = np.zeros((size[1], size[0]), dtype="f8")
+        weight = np.zeros(buff.shape, buff.dtype) if antialias else None

         finfo = self.ds._get_field_info(field)
         nodal_flag = finfo.nodal_flag
@@ -233,12 +234,12 @@ class CartesianCoordinateHandler(CoordinateHandler):
                                      nodal_data, coord, bounds, int(antialias),
                                      period, int(periodic))
         else:
-            pixelize_cartesian(buff, data_source['px'], data_source['py'],
+            pixelize_cartesian(buff, weight, data_source['px'], data_source['py'],
                                data_source['pdx'], data_source['pdy'],
                                data_source[field],
                                bounds, int(antialias),
                                period, int(periodic))
-        return buff
+        return np.divide(buff, weight) if antialias else buff

     def _oblique_pixelize(self, data_source, field, bounds, size, antialias):
         indices = np.argsort(data_source['pdx'])[::-1].astype(np.int_)
diff --git a/yt/utilities/lib/pixelization_routines.pyx b/yt/utilities/lib/pixelization_routines.pyx
index 20840afbc..4f31440cf 100644
--- a/yt/utilities/lib/pixelization_routines.pyx
+++ b/yt/utilities/lib/pixelization_routines.pyx
@@ -57,6 +57,7 @@ cdef extern from "pixelization_constants.h":
 @cython.boundscheck(False)
 @cython.wraparound(False)
 def pixelize_cartesian(np.float64_t[:,:] buff,
+                       np.float64_t[:,:] weight,
                        np.float64_t[:] px,
                        np.float64_t[:] py,
                        np.float64_t[:] pdx,
@@ -227,6 +228,7 @@ def pixelize_cartesian(np.float64_t[:,:] buff,
                                 # compositing instead of replacing bitmaps.
                                 if overlap1 * overlap2 == 0.0: continue
                                 buff[i,j] += (dsp * overlap1) * overlap2
+                                weight[i,j] += overlap1*overlap2
                             else:
                                 buff[i,j] = dsp

This isn't ideal, as it would require changing calls to pixelize_cartesian elsewhere, but it does help illustrate the problem. Looking at the weight array after pixelize_cartesian returns shows values varying from 1 to 3 for this dataset.
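
To see why the running weight helps, here is a small numpy toy (my own illustration, not the patched Cython) where the same cell lands on one pixel twice: the raw buffer reads 0.4, but normalizing by the accumulated weight recovers 0.2. A production version would presumably also need to guard zero-weight pixels, which a bare np.divide does not:

import numpy as np

# Each tuple is (i, j, dsp, overlap) for one deposited cell.
cells = [(0, 0, 0.2, 1.0),
         (0, 0, 0.2, 1.0),   # overlapping/duplicated cell, same pixel
         (1, 1, 1.3, 1.0)]

buff = np.zeros((2, 2))
weight = np.zeros_like(buff)
for i, j, dsp, overlap in cells:
    buff[i, j] += dsp * overlap      # what pixelize_cartesian does now
    weight[i, j] += overlap          # the proposed running weight

# Guarded divide: pixels nothing was deposited into stay 0, not nan.
out = np.divide(buff, weight, out=np.zeros_like(buff), where=weight > 0)
print(out[0, 0], out[1, 1])          # 0.2 1.3 -- duplicates average out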

ngoldbaum commented 7 years ago

Thanks for taking a look! I was also looking at this today. I will try to look closer at this tomorrow during the SciPy sprints.

ngoldbaum commented 7 years ago

Also, that's awesome to hear about the Einstein Toolkit frontend. Have you talked with Jonah Miller? He was working on a frontend based on the SimulationIO framework. Is this the same thing or a different effort?

cevans216 commented 7 years ago

A different effort, but inspired by Jonah's. I liked the final product but wanted to cut out the middleman of converting Carpet output to SimulationIO, to make the whole process more amenable to quick scripting. It's pretty close to done; it just needs a bit more polish. I'm hoping to put in a pull request soon.

ngoldbaum commented 7 years ago

Awesome, to be honest I was always skeptical about the extra dependency. Very much looking forward to reviewing the PR.

ngoldbaum commented 7 years ago

So it looks like this is an issue with masking:

In [6]: (ad['dx']*ad['dy']*ad['dz']).sum()
Out[6]: 0.020275604248046877 code_length**3

In [7]: (ds.domain_right_edge - ds.domain_left_edge).prod()
Out[7]: 0.0182 code_length**3

(this is for the lasslab dataset in the issue)

Those numbers should be equal: with correct masking, each point in the domain is counted exactly once, so the selected cell volumes tile the domain. Low-resolution cells that are covered by finer grids should be getting masked out but aren't.

ngoldbaum commented 7 years ago

I identified a bad grid:

In [25]: grid = ds.index.grids[2051]

In [26]: grid.child_mask.T
Out[26]:
array([[[False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [ True,  True,  True,  True,  True,  True,  True,  True,  True,
          True,  True,  True,  True,  True,  True,  True],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False],
        [False, False, False, False, False, False, False, False, False,
         False, False, False, False, False, False, False]]], dtype=bool)

In [27]: grid.Children
Out[27]:
[FLASHGrid_2053 ([16 16  1]),
 FLASHGrid_2082 ([16 16  1]),
 FLASHGrid_2095 ([16 16  1]),
 FLASHGrid_2124 ([16 16  1])]

In [28]: grid.Level
Out[28]: 1

Note how one row in the child mask is True, yet this grid has 4 children. In the FLASH AMR structure for a 2D sim, a grid with 4 children is fully refined, so it should always be completely masked out (i.e. all entries in grid.child_mask should be False). So I guess the next place to look is how the child_mask for this grid is getting computed.
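
A quick way to hunt for more grids like this is to recompute the expected mask directly from grid.Children and compare. A rough sketch (child_mask_consistent is a hypothetical helper, and the index mapping assumes child grids align to parent cell boundaries, as FLASH blocks do):

import numpy as np

def child_mask_consistent(grid):
    # Every parent cell covered by a child grid should be False in
    # child_mask; recompute that expectation and compare.
    expected = np.ones(grid.ActiveDimensions, dtype=bool)
    for child in grid.Children:
        lo = np.floor(((child.LeftEdge - grid.LeftEdge) / grid.dds).d).astype(int)
        hi = np.ceil(((child.RightEdge - grid.LeftEdge) / grid.dds).d).astype(int)
        expected[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = False
    return np.array_equal(expected, grid.child_mask)

# For the bad grid above this should come back False:
# child_mask_consistent(ds.index.grids[2051])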

ngoldbaum commented 7 years ago

I opened https://github.com/yt-project/yt/pull/1493 which should address this issue for FLASH data. @cevans216 I think the issue you're running into with your frontend might be a similar issue with masking. I would check to make sure the domain volume and the sum over the individual cell volumes match.
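
For reference, that check wrapped up as a function (a sketch; cell_volumes_tile_domain is a hypothetical name, and it assumes a cartesian dataset where correctly masked cell volumes tile the domain exactly):

import numpy as np

def cell_volumes_tile_domain(ds, rtol=1e-8):
    # With correct child masking, each point in the domain is selected
    # exactly once, so these two volumes should agree to roundoff.
    ad = ds.all_data()
    cell_vol = (ad['dx'] * ad['dy'] * ad['dz']).sum()
    domain_vol = (ds.domain_right_edge - ds.domain_left_edge).prod()
    return np.isclose(float(cell_vol), float(domain_vol), rtol=rtol)

# For the lasslab dataset in this issue, this returns False before the fix.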

cevans216 commented 7 years ago

Thanks for the guidance on that @ngoldbaum. I found my issue as well, and it was indeed similar. I had made an off-by-one error converting the vertex-centered Carpet data to cell-centered data. As a result, the last fcoord of each grid was at grid.RightEdge + grid.dds/2. (instead of grid.RightEdge - grid.dds/2.), overlapping with the first fcoord of the adjacent grid and duplicating data.
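
For anyone who trips over the same thing: with N+1 vertex coordinates per axis there are only N cell centers, the midpoints of adjacent vertices, so the last center belongs at RightEdge - dds/2. A generic numpy illustration (not the frontend code):

import numpy as np

left_edge, right_edge, n_cells = 0.0, 1.0, 8
verts = np.linspace(left_edge, right_edge, n_cells + 1)  # N+1 vertices
dds = (right_edge - left_edge) / n_cells

centers = 0.5 * (verts[:-1] + verts[1:])        # N midpoints, correct
assert np.isclose(centers[-1], right_edge - dds / 2)

# The off-by-one shifted the centers up by a full cell, so the last one
# landed at right_edge + dds/2, inside the neighboring grid:
shifted = verts[1:] + dds / 2
assert np.isclose(shifted[-1], right_edge + dds / 2)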