Kitware / pan3d

Other
10 stars 2 forks source link

PFlotran Example #72

Open johnkit opened 5 months ago

johnkit commented 5 months ago

PFLOTRAN is a hydrology code used by scientists at DOE Labs for watershed analysis and related research. It has a custom results/output format that is not directly compatible with Xarary. Using a "reference file system", we can access PFlotran data. I am adding an example here as a starting point for some additional features we should consider developing.

To reproduce the screenshot.

1. Create a python venv and install these packages

2. Run the python script pasted below and pass in the attached json file, e.g.

> python test_small.py  pfsmall-gcp.json

pfsmall-gcp.json

import argparse
import sys

import pyvista as pv
import pvxarray
import xarray as xr

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('ref_filesystem', help='path (or url) of reference file system (.json)')
    args = parser.parse_args()

    ds = xr.open_dataset('reference://', engine='zarr', backend_kwargs={
        'consolidated': False,
        'storage_options': {'fo': args.ref_filesystem}
    })
    print(ds)

    var_list = list(ds.keys())
    if not var_list:
        print('No data arrays')
        sys.exit()
    print(var_list)

    var = 'Total_Fe+++ [M]'
    if var not in var_list:
        print(f'Cannot display {var}')
        sys.exit()

    tcoord = ds.coords.get('time')
    last_time_index = len(tcoord) - 1

    da = ds[var]
    dat = da[dict(time=last_time_index)]
    print(dat)

    # All important transpose (which copies data - sigh)
    dat_T = dat.transpose()

    print()
    mesh = dat_T.pyvista.mesh(x='x_cell [m]', y='y_cell [m]', z='z_cell [m]')
    print(mesh)
    p = pv.Plotter()
    p.add_mesh_clip_plane(mesh, show_edges=False)
    p.add_axes()
    p.show()

FYI the PFlotran data file is an hdf5 file stored in Google Cloud bucket. It is a small (67MB) subset of a much larger dataset that Dipankar at Berkely Lab provided to us last year.

Candidate features/updates

From my notes Friday 22-Mar-2024 meeting with Dipankar and Chuyang

johnkit commented 5 months ago

I made a new script that more closely matches the intended visualization. It has 2 main differences from the original script (above)

Also note there is a new json file to pass in to the script: gcp.json

import argparse
import sys

import pyvista as pv
import pvxarray
import xarray as xr

DEFAULT_VARIABLE='Total_Fe+++ [M]'
ADD_BLANKING_ARRAY = True

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('ref_filesystem', help='path (or url) of reference file system (.json)')
    parser.add_argument('-v', '--variable', default=DEFAULT_VARIABLE, \
        help=f'variable to display [{DEFAULT_VARIABLE}]')

    parser.add_argument('-o', '--output_vtr_filename', help='write grid to .vtr file')
    parser.add_argument('-c', '--clip_plane', action='store_true', help='add clip plane to plot')
    args = parser.parse_args()

    # Must load with reference file system
    # And MUST disable mask_and_scale in order to input Material_ID (int32) correctly
    ds = xr.open_dataset('reference://', engine='zarr', mask_and_scale=False, backend_kwargs={
        'consolidated': False,
        'storage_options': {'fo': args.ref_filesystem}
    })
    print(ds)

    var_list = list(ds.keys())
    if not var_list:
        print('No data arrays')
        sys.exit()
    print(var_list)

    var_name = args.variable
    if var_name not in var_list:
        print(f'Cannot display {var_name}')
        sys.exit()

    print()
    # Create grid from point coordinates
    xg = ds['X [m]'].values
    yg = ds['Y [m]'].values
    zg = ds['Z [m]'].values
    grid = pv.RectilinearGrid(xg, yg, zg)
    # print(dir(grid))

    def update_grid(time_index):
        """Set grid's cell data to specified time"""
        t = time_index if isinstance(time_index, int) else int(time_index + 0.5)
        da = ds[var_name]
        var_data = da[t].values.transpose().flatten()  # MUST TRANSPOSE
        grid.cell_data.set_array(var_data, var_name)
    update_grid(0)

    if ADD_BLANKING_ARRAY:
        import numpy as np
        def blank_cell(mat_id):
            return 32 if mat_id < 1 else 0
        vblank_cell = np.vectorize(blank_cell, otypes=[np.uint8])

        # We are presuming blanking doesn't change w/time
        da_mat = ds['Material_ID'][0]
        np_blank = vblank_cell(da_mat).transpose().flatten()  # MUST TRANSPOSE
        grid.cell_data.set_array(np_blank, 'vtkGhostType')

    if args.output_vtr_filename is not None:
        from vtkmodules.vtkIOXML import vtkXMLRectilinearGridWriter
        writer = vtkXMLRectilinearGridWriter()
        writer.SetInputData(grid)
        writer.SetFileName(args.output_vtr_filename)
        writer.Write()
        print(f'Wrote {args.output_vtr_filename}')

    p = pv.Plotter()
    if args.clip_plane:
        p.add_mesh_clip_plane(grid, scalars=var_name, show_edges=False)
    else:
        p.add_mesh(grid, scalars=var_name, show_edges=False)
    p.add_axes()
    p.add_slider_widget(update_grid, [0.1, 5.1], fmt='%0.0f', value=0, title='Time Step')
    p.show()