openearth / aeolis-python

A process-based model for simulating supply-limited aeolian sediment transport
http://aeolis.readthedocs.io/
GNU General Public License v3.0
34 stars 26 forks source link

Integrate Reading Geotiffs for Improved Remote Sensing Data Integration #183

Open frederikvand opened 4 months ago

frederikvand commented 4 months ago

Description

Currently, the integration of remote sensing data in aeolis-python requires preprocessing where the geospatial metadata and preprocessing steps (e.g. rotation) are not stored in the output netCDF file. This limits both the implementation potential for digital twins and the compatibility of the resulting NetCDF with existing geospatial packages. I propose to make it possible to read geotiff files, extract geospatial metadata, and initialize the model state variables using the geotiff data.

Solution Overview

To improve the integration of remote sensing data, I propose the following enhancements:

  1. Integrate functions to read geotiff files and extract geospatial metadata. This will allow reading the arrays of geotiff-based digital elevation models (DEMs) and other remote sensing data. I would propose to integrate functionality to correct for precision errors in the resolution for robustness.
  2. Integrate a function to calculate the x and y grid coordinates based on the geotiff metadata. This will ensure accurate spatial representation of the input data.
  3. Add a configuration file parameter for specifying the path to the DEM geotiff file.
  4. Pass the three grids (x, y, zb) to the constants (initial state) if the DEM path is present in the configuration. This will initialize the model state with the actual spatial coordinates and elevation data from the geotiff.
  5. Add the true x and y grid coordinates to the output netCDF file. Add the geospatial metadata to the output netCDF file.
  6. Ensure that the output netCDF file is compatible with the xarray transform function xds["zb"].rio.to_raster('DEM_prediction.tif'). This will enable posthoc conversion for backwards compatibility.

Required Functions

The following functions would be needed to implement the proposed solution:

def read_geotiff_as_array(geotiff_path):
    """
    Reads a single-band geotiff file into a NumPy array.

    Parameters:
    - geotiff_path: String. Path to the geotiff file.

    Returns:
    - NumPy array containing the raster data from the first band.
    """
    with rasterio.open(geotiff_path) as dataset:
        # Read the raster data as a NumPy array
        raster_data = dataset.read(1)  # Reads the first band
        return raster_data

def extract_geotiff_metadata(geotiff_path):
    """
    Extracts spatial metadata from a geotiff file, dynamically addressing floating-point precision errors.

    Parameters:
    - geotiff_path: Path to the geotiff file.

    Returns:
    - A dictionary containing the CRS, transform, dynamically adjusted cell resolution for precision, and geotiff dimensions.
    """
    with rasterio.open(geotiff_path) as geotiff:
        # Dynamically determine the number of decimal places based on the resolution
        # This finds the first significant decimal place in the resolution
        significant_figures = -int(math.floor(math.log10(abs(geotiff.res[0]))))
        # Add additional precision to accommodate floating-point representation
        precision = significant_figures + 2

        # Address precision issue by rounding the resolution
        ds = round(geotiff.res[0], precision)
        dn = round(geotiff.res[1], precision)

        # Round the resolution components of the transform
        a = round(geotiff.transform.a, precision)  # Resolution in X direction
        e = round(geotiff.transform.e, precision)  # Resolution in Y direction (typically negative)

        # Create a new transform with the adjusted resolution
        # Note: This assumes no rotation; adjust accordingly if your geotiffs may have rotation
        adjusted_transform = Affine(a, geotiff.transform.b, geotiff.transform.c,
                                    geotiff.transform.d, e, geotiff.transform.f)

        crs = geotiff.crs
        geotiff_data = geotiff.read(1)
        ny, nx = geotiff_data.shape

        # Calculate grid cell surface area and its inverse
        dsdn = ds * abs(dn)
        dsdni = 1 / dsdn

    # Return the adjusted metadata
    return {
        'crs': crs,
        'transform': adjusted_transform,
        'ds': ds,  # Adjusted for flexible precision
        'dn': dn,  # Adjusted for flexible precision
        'dsdn': dsdn,  # Real-world grid cell surface area
        'dsdni': dsdni,  # Inverse of real-world grid cell surface area
        'nx': nx,
        'ny': ny,
        'zb': geotiff_data  # Include the geotiff data in the metadata for further processing
    }

def calculate_grid_coordinates(transform, nx, ny):
    # Calculate the center coordinates for each pixel
    x = np.arange(nx) * transform.a + transform.c + (transform.a / 2)  # center of pixel
    y = np.arange(ny) * transform.e + transform.f + (transform.e / 2)  # center of pixel

    return x, y

These above functions can be integrated into utils.py to enable reading geotiffs, extracting metadata, and initializing the model state with the spatial information.

The following variables in MODEL_STATE constants.py would be derived from the DEM geotiff:

('ny', 'nx') : (
    'x',                                # [m] Real-world x-coordinate of grid cell center
    'y',                                # [m] Real-world y-coordinate of grid cell center
    'ds',                               # [m] Real-world grid cell size in x-direction
    'dn',                               # [m] Real-world grid cell size in y-direction
    'dsdn',                             # [m^2] Real-world grid cell surface area
    'dsdni',                            # [m^-2] Inverse of real-world grid cell surface area
    'zb',                               # [m] Bed level above reference
)

Finally, the following variables should be passed to the netCDF file in netcdf.py when the simulation data is written to make it compatible to geospatial packages:

# store spatial metadata
nc.variables['x'][:,:] = s['x']
nc.variables['y'][:,:] = s['y']
nc.variables['lat'][:,:] = s['lat']  # Add latitude values
nc.variables['lon'][:,:] = s['lon']  # Add longitude values
nc.variables['x_bounds'][:,:,:] = s['x_bounds']  # Add x-coordinate bounds
nc.variables['y_bounds'][:,:,:] = s['y_bounds']  # Add y-coordinate bounds
nc.variables['lat_bounds'][:,:,:] = s['lat_bounds']  # Add latitude bounds
nc.variables['lon_bounds'][:,:,:] = s['lon_bounds']  # Add longitude bounds

# store geotiff metadata
nc.variables['crs'].spatial_ref = s['crs'].to_string()  # Add CRS as spatial reference string
nc.variables['crs'].GeoTransform = ' '.join(map(str, s['transform']))  # Add GeoTransform attribute

# store model state variables derived from geotiff
nc.variables['zb'][:,:] = s['zb']  # Bed level
nc.variables['ds'][:] = s['ds']  # Grid cell size in x-direction
nc.variables['dn'][:] = s['dn']  # Grid cell size in y-direction
nc.variables['dsdn'][:] = s['dsdn']  # Grid cell surface area
nc.variables['dsdni'][:] = s['dsdni']  # Inverse of grid cell surface area