corteva / rioxarray

geospatial xarray extension powered by rasterio
https://corteva.github.io/rioxarray
Other
525 stars 83 forks source link

DOC: Example of converting 2D coordinates to regular grid #209

Open snowman2 opened 3 years ago

snowman2 commented 3 years ago

Your data has unevenly spaced 2D lat/lon values.

That means that either:

  1. Your data is not geographic and was re-projected to lat/lon in the 2D space to preserve the coordinate locations.
  2. Your data is not represented in an evenly spaced grid.

If (1) is your problem, you will need to re-project your data from 2D latlon back to the projected coordinates in 1D, add the 1D x&y coordinates to your dataset. Then it should work.

If (2) is your problem, then you will need to decide on the resolution of your grid, the bounds of your area of interest, and resample your data to the new regularly spaced grid. Then it will work.

From: https://github.com/corteva/rioxarray/issues/47#issuecomment-532377611

snowman2 commented 3 years ago

There are several options for the case where the data is unevently spaced and you want to convert it to a regular grid;

snowman2 commented 3 years ago

geocube: Step 1: https://gis.stackexchange.com/questions/384581/raster-to-geopandas/384691#384691

rds = xarray.open_dataset("path_to_file.nc")
df = rds.squeeze().to_dataframe().reset_index()
geometry = gpd.points_from_xy(df.x, df.y)
gdf = gpd.GeoDataFrame(df, crs=rds.rio.crs, geometry=geometry)

Step 2: https://corteva.github.io/geocube/stable/examples/rasterize_point_data.html

from geocube.api.core import make_geocube
from geocube.rasterize import rasterize_points_griddata

geo_grid = make_geocube(
    vector_data=gdf,
    resolution=(-0.1, 0.1),
    rasterize_function=rasterize_points_griddata,
)
Zepy1 commented 3 years ago

There are several options for the case where the data is unevently spaced and you want to convert it to a regular grid;

* `pyresample`: https://pyresample.readthedocs.io/en/latest/swath.html (maybe @djhoese could assist here?)

* The `rasterize_points_` functions powered by scipy: https://github.com/corteva/geocube/blob/master/geocube/rasterize.py

* `gdal.Grid`: https://gdal.org/python/osgeo.gdal-module.html#Grid

Also the Xoak library seems promising to perform point-wise selection of irregularly spaced data : https://xoak.readthedocs.io/en/latest/examples/introduction.html

snowman2 commented 3 years ago

See: #202

snowman2 commented 10 months ago

Related #724

j08lue commented 2 months ago

For case 2 of the above - if you are dealing with a curvilinear lat/lon grid, you may want to check out the regridding pro tool, xESMF:

import xarray as xr
import xesmf as xe
import numpy as np
import rioxarray

ds = xr.open_dataset("curvilinear.nc")

dlon = dlat = 0.01
target_grid = {
    "lon": np.arange(ds["longitude"].min(), ds["longitude"].max() + dlon, dlon, dtype="float32"), 
    "lat": np.arange(ds["latitude"].min(), ds["latitude"].max() + dlat, dlat, "float32")
}

regridder = xe.Regridder(ds, target_grid, "bilinear", unmapped_to_nan=True)
ds_regridded = regridder(ds)

da_regridded = ds_regridded[VARIABLE_NAME]

da_regridded.rio.write_crs("epsg:4326", inplace=True)
da_regridded.rio.set_spatial_dims("lon", "lat", inplace=True)
da_regridded.rio.to_raster("rectilinear.tif")
j08lue commented 2 months ago

Here is an example reprojecting from 2D to 1D (regular) lat/lon grid using Rasterio 1.4 with src_geoloc_array and reconstructing a nice Xarray DataArray again:

NB: This requires Rasterio 1.4 (currently in beta: pip install rasterio==1.4b1)

import xarray as xr
import numpy as np
import rasterio
import rasterio.warp
import rioxarray
from rioxarray.rioxarray import affine_to_coords

PATH = "tas_rcp85_land-cpm_uk_2.2km_01_day_19801201-19811130.nc"
PARAMETER_NAME = "tas"
WGS84 = rasterio.crs.CRS.from_epsg(4326)

ds = xr.open_dataset(PATH)
lon2d = ds["longitude"].values
lat2d = ds["latitude"].values
source = ds[PARAMETER_NAME].isel(ensemble_member=0).values

src_height, src_width = lon2d.shape

dst_transform, dst_width, dst_height = rasterio.warp.calculate_default_transform(
    src_crs=WGS84,
    dst_crs=WGS84,
    width=src_width,
    height=src_height,
    src_geoloc_array=(lon2d, lat2d),
)

destination = np.full((len(source), dst_height, dst_width), np.nan)

data, transform = rasterio.warp.reproject(
    source,
    destination=destination,
    src_crs=WGS84,
    dst_crs=WGS84,
    dst_transform=dst_transform,
    dst_nodata=np.nan,
    src_geoloc_array=np.stack((lon2d, lat2d))
)

coords = affine_to_coords(transform, width=dst_width, height=dst_height, x_dim="x", y_dim="y")
coords.update(time=ds["time"])

filtered_attrs = ds[PARAMETER_NAME].attrs.copy()
filtered_attrs.pop("grid_mapping", None)

da = xr.DataArray(data, coords=coords, dims=("time", "y", "x"), name=PARAMETER_NAME, attrs=filtered_attrs)

As notebook here: https://gist.github.com/j08lue/e792b3c912c33e9191734af7e795b75c

This obviously makes a ton of assumptions - that your data is in float dtype, what the original 2D lat/lon fields are called.

@snowman2, you were probably looking for a solution that uses rioxarray in the first place, not only for affine_to_coords, right?

snowman2 commented 2 months ago

you were probably looking for a solution that uses rioxarray in the first place

I would like to add geolocation support to rio.reproject at some point when I get time. Though, I wouldn't mind if someone beat me to it.

j08lue commented 2 months ago

What kind of interface do you have in mind?

How about:

rio.reproject(..., src_geoloc_array=("lon_2d", "lat_2d"))  # names of 2d coordinate variables in the Dataset
rio.reproject(..., src_geoloc_array=(lon_2d, lat_2d))  # 2d coordinate arrays
rio.reproject(..., src_geoloc_array=True)  # apply some heuristics to guess the arrays - not sure that kind of "magic" is appropriate for rioxarray

?

snowman2 commented 2 months ago

names of 2d coordinate variables

I would rather pass on this idea presently.

2d coordinate arrays

rio.reproject(..., src_geoloc_array=(lon_2d, lat_2d))  # 2d coordinate arrays

Yes, this should be the simplest to implement and I would definitely like to see this work.

apply some heuristics to guess the arrays - not sure that kind of "magic" is appropriate for rioxarray

It would be great to have it automatically identify if only 2D coordinate arrays are present and use them. To do this, there would need to be code to search for 2D coordinate arrays.

snowman2 commented 1 week ago

See #822