OpenDrift / opendrift

Open source framework for ocean trajectory modelling
https://opendrift.github.io
GNU General Public License v2.0
250 stars 120 forks source link

netCDF structured domains with masked coordinates lead to pixelsize == 0 #1436

Open lencart opened 3 weeks ago

lencart commented 3 weeks ago

Framing

While writing a structured netCDF reader for Delft3D-Flow outputs, I found out that the pixel size and bounding box are calculated wrong due to the masked coordinates.

Delft3D-Flow places masks not only on the variables but in the coordinates where the domain is not defined.

Possible causes

  1. The x, y max min is changed inside structured.py (line 57) based on self.lon and self.lat shapes ignoring the mask. This overrides the setting of the extreme pixels inside the delft3d reader. Plus, it seems to switch the coordinate dimensions.

  2. Because opendrift is looking for a bounding box of a masked array, some of the bounding box values can be mapped into the closest NaN by scipy.ndmimage.map_coordinates.

Test files

The delft3d-flow reader can be found here. You may want to look at the standard variable mapping at the top of the module so that you are able to extract the correct variables from the netCDF files that you can find here for the unprojected wgs84 and here for the projected in epgsg:28992

knutfrode commented 3 weeks ago

You can define a dedicated pixelsize method in this specific reader,

def pixelsize(self):
    return 1

If I temorarily do this in structured.py, it is possibly to create a generic Reader for the unprojected file, with some mapping of standard_name:

from opendrift.readers.reader_netCDF_CF_generic import Reader
r = Reader('trim-f34_wgs84.nc', standard_name_mapping={'U1': 'x_sea_water_velocity', 'V1': 'y_sea_water_velocity'})
print(r)

===========================
Reader: trim-f34_wgs84.nc
Projection: 
  None
Coverage: [pixels]
  xmin: 0.000000   xmax: 21.000000   step: 1   numx: 22
  ymin: 0.000000   ymax: 14.000000   step: 1   numy: 15
  Corners (lon, lat):
    (  0.00,   0.00)  (  0.00,   0.00)
    (  0.00,   0.00)  (  0.00,   0.00)
Vertical levels [m]: 
  Not specified
Available time range:
  start: 1990-08-05 12:30:00   end: 1990-08-06 01:00:00   step: 0:15:00
    51 times (0 missing)
Variables:
  longitude
  latitude
  x_sea_water_velocity
  y_sea_water_velocity
  sea_water_speed - derived from ['x_sea_water_velocity', 'y_sea_water_velocity']
===========================

But it remains to deal with the lon- and lat- arrays containing 0 values where they should be masked. OpenDrift does presently not handle lon- lat arrays with masked values, but perhaps this could be implemented in a delft-reader, overriding existing methods in parent reader classes. Later, and ideally, these features/changes could be moved to the parent classes, removing the need for a dedicated Delft-reader.

lencart commented 3 weeks ago

Thanks for looking into it. Could you please point me to which parent class methods I need to override?

knutfrode commented 2 weeks ago

I had a look at this, and I am afraid that overriding parent methods might lead to unpredictable behaviour, as the "structured" reader class is based on assumption that we have 2D arrays of lon and lat (or 1d arrays for projected datasets).

With the unprojected grid file, self.lon and self.lat are 2D arrays where values are not masked, but are simply 0 over land. The only solution I can see here would be if the Delft-reader could extrapolate/fill these 0/masked-values based on the available lon/lat values, so that self.lon and self.lat do not contain any 0- or NaN-values when passed to the parent/super-classes. This might give some inaccuracy over land, but that should not be important as there is anyway no valid data.