multiply-org / multiply-core

The core functionality of the MULTIPLY platform
2 stars 5 forks source link

Resampling data throughout MULTIPLY platform #6

Open tramsauer opened 6 years ago

tramsauer commented 6 years ago

Should there be a central method of resampling input/intermediate(prior)/output data to avoid double coding and preserve consistency throughout the platform?

Passing on (e.g. prior) data as NetCDF files would be beneficial, as the resampling would be very easy. But, read/write operations should be limited. @JorisTimmermans mentioned that there is an in-memory netcdf-file-container format which could be used.

What are your thoughts on this @TonioF ?

jgomezdans commented 6 years ago

NetCDF has very poor support for projections. Howevever, if you manage to get the projection recognised by GDAL (v>2), it is then trivial to resample/reproject things on the fly to just extract the relevant part . The following function does just that and returns the re-sampled/reprojected dataset


import gdal
import osr
def reproject_image(source_img, target_img, dstSRSs=None):
    """Reprojects/Warps an image to fit exactly another image.
    Additionally, you can set the destination SRS if you want
    to or if it isn't defined in the source image."""
    g = gdal.Open(target_img)
    geo_t = g.GetGeoTransform()
    x_size, y_size = g.RasterXSize, g.RasterYSize
    xmin = min(geo_t[0], geo_t[0] + x_size * geo_t[1])
    xmax = max(geo_t[0], geo_t[0] + x_size * geo_t[1])
    ymin = min(geo_t[3], geo_t[3] + y_size * geo_t[5])
    ymax = max(geo_t[3], geo_t[3] + y_size * geo_t[5])
    xRes, yRes = abs(geo_t[1]), abs(geo_t[5])
    if dstSRSs is None:
        dstSRS = osr.SpatialReference()
        raster_wkt = g.GetProjection()
        dstSRS.ImportFromWkt(raster_wkt)
    else:
        dstSRS = dstSRSs
    g = gdal.Warp('', source_img, format='MEM',
                  outputBounds=[xmin, ymin, xmax, ymax], xRes=xRes, yRes=yRes,
                  dstSRS=dstSRS)
    return g```
TonioF commented 6 years ago

Currently, we do not yet have such a common code repository, but it becomes clearer that it is a good idea to have one. I support the idea of not re-using code snippets. It usually leads to inconsistencies when something is change din one place but not in the other, and we certainly want to avoid that. Was that netcdf supporting container xarray, @JorisTimmermans ?

Now, to be more specific: I am not sure what resampling is required in multiple components. Is it like the one @jgomezdans posted?

jgomezdans commented 6 years ago

So how to make NetCDF files GDAL compliant.... Suppose we have a netcdf file called stuff.nc, with two datasets sigma0_VV and sigma0_VH. We also assume that these (raster) data are stored in WGS84/LatLong. To make the data file projection information understandable by GDAL, we can use the nco tools to

ncap -h -O -s 'crs=-9999' stuff.nc stuff.nc 
ncatted -h -O \
    -a spatial_ref,crs,c,c,'GEOGCS[\"GCS_WGS_1984\",DATUM[\"WGS_1984\",SPHEROID[\"WGS_84\",6378137.0,298.257223563]],PRIMEM[\"Greenwich\",0.0],UNIT[\"Degree\",0.017453292519943295]]' \
    -a grid_mapping_name,crs,c,c,'latitude_longitude' \
    -a longitude_of_prime_meridian,crs,c,d,0 \
    -a semi_major_axis,crs,c,d,6378137 \
    -a inverse_flattening,crs,c,d,298.257223563 \
    -a grid_mapping,sigma0_VV,c,c,'crs' \
    -a grid_mapping,sigma0_VH,c,c,'crs' \
    stuff.nc

Calling gdalinfo shows how the magic has fared...

$ gdalinfo 'NETCDF:"stuff.nc":sigma0_VH'
Driver: netCDF/Network Common Data Format
Files: stuff.nc
Size is 1682, 1122
Coordinate System is:
GEOGCS["GCS_WGS_1984",
    DATUM["WGS_1984",
        SPHEROID["WGS_84",6378137.0,298.257223563]],
    PRIMEM["Greenwich",0.0],
    UNIT["Degree",0.017453292519943295]]
Origin = (11.599997254990416,48.340028605146351)
Pixel Size = (0.000124865824493,-0.000124865824493)
Metadata:
  crs#grid_mapping_name=latitude_longitude
  crs#inverse_flattening=298.257223563
  crs#longitude_of_prime_meridian=0
  crs#semi_major_axis=6378137
  crs#spatial_ref=GEOGCS["GCS_WGS_1984",DATUM["WGS_1984",SPHEROID["WGS_84",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.017453292519943295]]
  lat#long_name=latitude
  lat#standard_name=latitude
  lat#units=degrees_north
  lat#valid_max=48.33996617223411
  lat#valid_min=48.19999158297789
  lon#long_name=longitude
  lon#standard_name=longitude
  lon#units=degrees_east
  lon#valid_max=11.80995913887475
  lon#valid_min=11.60005968790266
  NC_GLOBAL#Conventions=CF-1.4
  NC_GLOBAL#start_date=01-JAN-2017 16:59:02.631879
  NC_GLOBAL#stop_date=01-JAN-2017 16:59:04.505051
  NC_GLOBAL#TileSize=564:564
  sigma0_VH#coordinates=lat lon
  sigma0_VH#grid_mapping=crs
  sigma0_VH#_FillValue=0
Geolocation:
  LINE_OFFSET=0
  LINE_STEP=1
  PIXEL_OFFSET=0
  PIXEL_STEP=1
  SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]]
  X_BAND=1
  X_DATASET=NETCDF:"stuff.nc":lon
  Y_BAND=1
  Y_DATASET=NETCDF:"stuff.nc":lat
Corner Coordinates:
Upper Left  (  11.5999973,  48.3400286) ( 11d35'59.99"E, 48d20'24.10"N)
Lower Left  (  11.5999973,  48.1999292) ( 11d35'59.99"E, 48d11'59.74"N)
Upper Right (  11.8100216,  48.3400286) ( 11d48'36.08"E, 48d20'24.10"N)
Lower Right (  11.8100216,  48.1999292) ( 11d48'36.08"E, 48d11'59.74"N)
Center      (  11.7050094,  48.2699789) ( 11d42'18.03"E, 48d16'11.92"N)
Band 1 Block=841x561 Type=Float32, ColorInterp=Undefined
  NoData Value=0
  Metadata:
    coordinates=lat lon
    grid_mapping=crs
    NETCDF_VARNAME=sigma0_VH
    _FillValue=0

@multiply-org/multiply-developers @GerardoLopez

TonioF commented 6 years ago

We now have the common code repository in multiply-core. If there is a need to put any function there - such as the one from Josés post - that might be beneficial to more than one component we might put it there. For now, I have added the AttributeDict that can easily read configs.