MPAS-Dev / MPAS-Tools

MPAS Tools Repository
Other
39 stars 65 forks source link

Switch from `rasterio` to `inpoly` for signed distance masks #420

Closed xylar closed 3 years ago

xylar commented 3 years ago

inpoly is likely much faster, is easier to use, and requires far fewer dependencies. inpoly also is aware of detecting points exactly on the boundary of the polygon, eliminating the need to expand the polygon to capture points right on the boundary.

Both the affine and rasterio dependencies are removed as they are no longer needed.

rasterio is not yet compatible with the latest proj, which is causing trouble for updating mpas_tools to the latest libnetcdf (it's a long story...). A new release with this update will eliminate that problem.

xylar commented 3 years ago

Testing

I made the base SOwISC12tto60 mesh with rasterio (old way) and inpoly (new way).

rasterio: cellWidthGlobal

inpoly: cellWidthGlobal

I didn't bother to compute timing because computing the mask is trivial compared to computing distance (both before and after this change). But I did take a careful look at the mask in paraview and it looks exactly right.

Here is a small test script I also used in my debugging:

#!/usr/bin/env python
import numpy
import xarray

from geometric_features import read_feature_collection

from mpas_tools.mesh.creation.signed_distance import mask_from_geojson, \
    signed_distance_from_geojson
from mpas_tools.io import write_netcdf
from mpas_tools.cime.constants import constants

if __name__ == '__main__':

    fc = read_feature_collection('high_res_region.geojson')

    dlon = 0.1
    dlat = 0.1
    nlon = int(360./dlon) + 1
    nlat = int(180./dlat) + 1

    lon = numpy.linspace(-180., 180., nlon)
    lat = numpy.linspace(-90., 90., nlat)

    earth_radius = constants['SHR_CONST_REARTH']

    signed_distance = signed_distance_from_geojson(fc, lon, lat, earth_radius,
                                                   max_length=0.25)

    mask = mask_from_geojson(fc, lon, lat)

    ds = xarray.Dataset()
    ds['lon'] = ('lon', lon)
    ds['lat'] = ('lat', lat)
    ds['mask'] = (('lat', 'lon'), mask.astype(int))
    ds['signed_distance'] = (('lat', 'lon'), signed_distance)
    write_netcdf(ds, 'mask.nc')

It uses high_res_region.geojson.

xylar commented 3 years ago

@dengwirda, all I'm hoping for in a review is a quick visual inspection that nothing looks obviously wrong or potentially problematic with my usage of inpoly2 here. If you want to run tests of your own, feel free.

dengwirda commented 3 years ago

@xylar, I also tested with the full region.geojson file:

import time
import numpy
import xarray

from geometric_features import read_feature_collection

from mpas_tools.mesh.creation.signed_distance import mask_from_geojson
from mpas_tools.io import write_netcdf

if __name__ == '__main__':

    fc = read_feature_collection('region.geojson')

    dlon = 0.1
    dlat = 0.1
    nlon = int(360./dlon) + 1
    nlat = int(180./dlat) + 1

    lon = numpy.linspace(-180., 180., nlon)
    lat = numpy.linspace(-90., 90., nlat)   

    ttic = time.time()
    mask = mask_from_geojson(fc, lon, lat)
    ttoc = time.time()
    print("mask:", ttoc - ttic)

    ds = xarray.Dataset()
    ds['lon'] = ('lon', lon)
    ds['lat'] = ('lat', lat)
    ds['mask'] = (('lat', 'lon'), mask.astype(int))
    write_netcdf(ds, 'mask.nc')

which looks reasonable enough: mask although is a little different (at the boundaries) compared to the rasterio approach (new - old): diff It looks like the polygons are slightly "fatter" (i.e. more land) in the rasterio approach?

xylar commented 3 years ago

@dengwirda, yes, with rasterio, we had to expand polygons slightly because they did not include the boundaries (i.e. in did not include on), which was a problem.

xylar commented 3 years ago

@dengwirda, if you're happy, please approve so I can merge and do a release tomorrow.

dengwirda commented 3 years ago

@xylar, how important is it to maintain consistency across MPAS-Tools updates here? There are definitely some differences, this is a zoom of the amazon delta (inpoly, rasterio, region.geojson): mask_zoom It looks like the inpoly workflow is capturing the detail in the coastlines correctly, so I'm happy there, but if it's necessary to maintain closer consistency with the current MPAS-Tools I think we may need to look further?

xylar commented 3 years ago

@dengwirda, I appreciate you being so thorough in your review. I do not believe that masking is currently used in any workflow where the exact details of the shape are relevant. For example, it is not currently used to handle the coastlines, that is another routine. Instead, the current usage as far as I'm aware is for the sign part of signed distance functions and in a few cases for defining boundaries between basins, in which case the location of the boundary is intentionally placed over land so the exact location isn't important.

What is more, we do not try to guarantee that meshes are reproducible across versions of MPAS-Tools (or mpas_tools, the python package). Indeed, if we also make updates to jigsaw and jigsaw-python soon, that won't be the case for sure. So I think this is fine.

xylar commented 3 years ago

Thanks, @dengwirda!