ARM-DOE / pyart

The Python-ARM Radar Toolkit. A data model driven interactive toolkit for working with weather radar data.
https://arm-doe.github.io/pyart/
Other
516 stars 268 forks source link

Replace the Grid class layout #285

Closed jjhelmus closed 8 years ago

jjhelmus commented 9 years ago

The current layout of the Grid class is historical and ad-hoc, it does not follow a formal standard. A more formal layout of the class should be used which conforms to the CF conventions. The section of the CF convention which are relavent are 5.6 Horizontal Coordinate Reference Systems, Grid Mappings, and Projections and Appendix F. Grid Mappins.

A specification for Gridspec standard has been put forward as a possible method for describing gridding data within the CF conventions. This standard is aimed at storing model data but may be adaptable for storing gridding radar data.

There is also the UGRID Conventions for specifying unstructured grids.

gamaanderson commented 9 years ago

I've just read that Gridspec documentation, and it seemed for me overly complicated for our simple use. Also it date from 2007 and does not seem to have gained much momentum since them. I don't see how the class Gridspec in matplotlib relate to that, may be is just a coincidence.

CF, on the other hand, is more promising. To make it more friendly, one could add functions that interpret that data and return more useful python variable, for instance a Proj instance from pyproj.

jjhelmus commented 9 years ago

Reading through GridSpec it does seem more complicated than we need. There seems to be a number example of representing uniform latitude/longitude grids in NetCDF but I have not found one which covers uniform Cartesian grids which is what Py-ART is producting. Although this does correspond to a lat/lon is is not uniformly spaced. I'll check into what is required for a Proj instance as see if any of the modelers I work with know of any other Grid standards.

gamaanderson commented 9 years ago

Have you seen this sites: http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/reference/StandardCoordinateTransforms.html http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/tutorial/CoordinateAttributes.html

With this convention I was, today, finally able to plot Cartesian data in THREDDS, with implies it was automatically recognized by NetCDF JAVA. This was a task that other here and myself have tried a lot and in the end give up as impossible. I did however need to set several attributes, here is the netcdf header (using points to indent)

netcdf test_proj {
dimensions:
    y = 400 ;
    x = 600 ;
    time = 2 ;
    depth_below_surface = 4 ;
variables:
    float Soil_temperature(time, depth_below_surface, y, x) ;
        Soil_temperature:units = "K" ;
        Soil_temperature:_CoordinateSystems = "ProjectionCoordinateSystem" ;
    double y(y) ;
        y:units = "km" ;
        y:long_name = "y coordinate of projection" ;
        y:_CoordinateAxisType = "GeoY" ;
    double x(x) ;
        x:units = "km" ;
        x:long_name = "x coordinate of projection" ;
        x:_CoordinateAxisType = "GeoX" ;
    int time(time) ;
        time:long_name = "forecast time" ;
        time:units = "hours since 2003-09-03T00:00:00Z" ;
        time:_CoordinateAxisType = "Time" ;
    double depth_below_surface(depth_below_surface) ;
        depth_below_surface:long_name = "Depth below land surface" ;
        depth_below_surface:units = "m" ;
        depth_below_surface:_CoordinateAxisType = "Height" ;
        depth_below_surface:_CoordinateZisPositive = "down" ;
    byte ProjectionCoordinateSystem ;
        ProjectionCoordinateSystem:_CoordinateAxes = "time depth_below_surface y x" ;
        ProjectionCoordinateSystem:_CoordinateTransformType = "Projection" ;
        ProjectionCoordinateSystem:grid_mapping_name = "azimuthal_equidistant" ;
        ProjectionCoordinateSystem:longitude_of_projection_origin = -55. ;
        ProjectionCoordinateSystem:latitude_of_projection_origin = -25. ;
        ProjectionCoordinateSystem:semi_major_axis = 6378137. ;
        ProjectionCoordinateSystem:inverse_flattening = 298.257223563 ;
        ProjectionCoordinateSystem:longitude_of_prime_meridian = 0. ;
        ProjectionCoordinateSystem:false_easting = 0. ;
        ProjectionCoordinateSystem:false_northing = 0. ;
}
jjhelmus commented 9 years ago

Great find! Unidata has some very nice products so using a format they support sounds like a good idea. If the write_grid function can produce NetCDF files that work with THREDDS and their other tools I think that would be ideal.

It seems that a projection is required for these to work which is understandable but I think should be used with caution. Py-ARTs grid object is a 3D Cartesian grid centered at a particular latitude, longitude, there is no explicit projection assuming in the gridding process. To derive latitudes and longitudes for each point a projection must be used but the results will differ if different projections are used. With this the case care must be taken to allow the user to specify the projection desired and include this projection once specified in the object.

In addition Py-ART is using basemap for x,y -> lat, lon projection but this is an optional dependency so care must be taken to provide fallback procedures when this package is not installed.

gamaanderson commented 9 years ago

I really can not understand how is it possible to have no projection, there is always assumptions about it, even if one is unaware of those.

From what I understand radar_coords_to_cart as it does not disrupt the grid, it is using the best representation of an area around a point, that is azimuthal_equidistant projection. This is particularly good, since rays in this projections are geodesic, and therefore the path followed by a electromagnetic pulse. I'm not sure if Doviak and Zrnic explicitly say this, but following the formula deduction (which I have done several times) one is assuming this.

Of course the user shall have the ability of overwrite the standard "azimuthal_equidistant", but by doing this it will decrease the spacial precision.

jjhelmus commented 9 years ago

I did not think that the radar_coords_to_cart function assumed a map projection, rather my understanding was that it converts an azimuth, elevation, range coordinate system into a x, y, z, Cartesian system using typical propagation of microwaves in the atmosphere. The function does assume a spherical 4/3 earth radius earth so there is an underlying map projection assumption that I was not aware of. Sorry for misrepresenting this, I was incorrect.

After reading up on the Azimuthal equidistant projection this does seem to be the projection being used in the radar_coords_to_cart derivation, thanks for pointing this out @gamaanderson!

I still wonder how best to obtain latitudes and longitudes for all grid points from the grid center latitude and longitude and the Cartesian distances. It appears as if the azimuthal equidistant projection provides analytic equations which map from azimith/elevation -> x, y, -> lat, lon which could be used but other map projection from pyproj.Proj could also be used. I believe this is exactly what PR #264 implements, a function to calculate latitude and longitudes from a grid object either using an internal implementation of the azimuthal equidistant projection or a Proj projection, can you confirm this @gamaanderson? If so, I'll check it over and work on merging it asap.

scollis commented 9 years ago

Exactly.. Just x,y and height displacement from a spherical surface assuming standard atmosphere (R_e * 4/3)

The paradigm has always been to call the grid mapper with these coordinates and then prescribe lat lons later..

One could always take a set of lat lons, back out X,Y and call the KD-Tree with that

On 5/5/15 9:34 AM, Jonathan J. Helmus wrote:

I did not think that the |radar_coords_to_cart| function assumed a map projection, rather my understanding was that it converts an azimuth, elevation, range coordinate system into a x, y, z, Cartesian system using typical propagation of microwaves in the atmosphere. The function does assume a spherical 4/3 earth radius earth so there is an underlying map projection assumption that I was not aware of. Sorry for misrepresenting this, I was incorrect.

After reading up on the Azimuthal equidistant projection http://en.wikipedia.org/wiki/Azimuthal_equidistant_projection this does seem to be the projection being used in the |radar_coords_to_cart| derivation, thanks for pointing this out @gamaanderson https://github.com/gamaanderson!

I still wonder how best to obtain latitudes and longitudes for all grid points from the grid center latitude and longitude and the Cartesian distances. It appears as if the azimuthal equidistant projection provides analytic equations which map from azimith/elevation -> x, y, -> lat, lon which could be used but other map projection from |pyproj.Proj| could also be used. I believe this is exactly what PR #264 https://github.com/ARM-DOE/pyart/pull/264 implements, a function to calculate latitude and longitudes from a grid object either using an internal implementation of the azimuthal equidistant projection or a Proj projection, can you confirm this @gamaanderson https://github.com/gamaanderson? If so, I'll check it over and work on merging it asap.

— Reply to this email directly or view it on GitHub https://github.com/ARM-DOE/pyart/issues/285#issuecomment-99096849.

I ride for Parkinsons research http://www.events.org/sponsorship.aspx?id=51573

gamaanderson commented 9 years ago

Yes @jjhelmus, you are correct about PR #264. I just update some alterations I had (PEP8, docstring etc...). In theory it could be merged, but I believe is better to put as a method of Grid. Also chose a better name.

jjhelmus commented 9 years ago

I went ahead and merged #264 so that the code is available in master. When thr Grid class is rewritten this function should renamed and added as a method of the class. Also the latitude and longitude metadata should be added to the default_config.py file and be extracted from there.

kirknorth commented 9 years ago

Any further development or comments about this? I'm just starting to work with WRF forward modeled radar observations, the vertical dimension of the simulated radar grids being described by geopotential height _Zg, so for example, the z_disp axis of the Grid is no longer properly defined as a 1-D array since _Zg is a function of (x, y, z).

scollis commented 9 years ago

Hey Kirk, I am about to convene a group of interested folks on this.. I am going to include you as interested.. I will see Pavlos in DC tomorrow so will touch base with him as well

On 10/20/15 3:28 PM, Kirk North wrote:

Any further updates or comments about this? I'm just starting to work with WRF forward modeled radar observations, the vertical dimensions of the simulated radar grids being described by geopotential height /Z_g/, so for example, the _zdisp axis of the |Grid| is no longer properly defined since /Z_g/ is a function of (/x/, /y/).

— Reply to this email directly or view it on GitHub https://github.com/ARM-DOE/pyart/issues/285#issuecomment-149692729.

jjhelmus commented 8 years ago

Here is a proposal for a new layout of the Grid class. Please provide comments!

First an overview of the layout of the class currently:

Legacy Grid class layout

Attributes

The Grid class is intended to store data from a rectilinear grid in Cartesian space. Initially the grid will have uniform spacing along a given x,y, or z dimension although this will not be a requirement of the class, non-uniform spacing along a dimension can be accommodated in the layout.

New Grid class layout

All 3D arrays will have dimension ordered as z, y, x.

Attributes

Methods

A few notes about the types of Grids supported by this class:

Plotting the Grid object would read from the the point_latitude, longitude, altitude or point_x, y, z arrays for the position of the grid points.

Grids which evolve over time (four dimensional grids) are not covered by this class. A GridCollection class is envisioned for this purpose when needed.

jjhelmus commented 8 years ago

While implements this new Grid class I've come upon a few additional attribute which I think will be helpful:

Additionally I've renamed some of the methods from the original proposal:

jjhelmus commented 8 years ago

Also I'm proposing the following depreciate plan for the legacy Grid class is as follows:

Please comment if any of these cause concern.

gamaanderson commented 8 years ago

I approve nx, ny ,nz. I think the netcdf idea of naming the dimensions is really good verbose, even if redundant.

I kind of liked the axes dict, it separated the geometry describing variable from the other accessory attributes. But if it must go, it is also not that bad.

tjlang commented 8 years ago

I've used the axes dict quite a bit, but I can always change calls in that code if needed.

jjhelmus commented 8 years ago

I've finished PR #434 which implements the new layout for the Grid class and the NetCDF files that Py-ART uses to store these objects. I made a long write-up on the changes in that PR. If anyone wants to review the PR I would welcome and greatly appreciate it.

@gamaanderson and @tjlang and other who may need to change their code to make use of the new layout should know about the Grid.from_legacy_parameter class method which uses the older "legacy" Grid parameters to create a Grid object. Additionally, the read_legacy_grid method can be used to read in NetCDF files which use the legacy variable names and the convert_legacy_grid script will convert these files to the new format. No support is included for creating NetCDF files with the legacy variable names.

Hopefully this will make the transition easier, although I know it will still require some changes. I do think they new layout is more informative and will offer flexibility down the road.

Seeing everything implemented, I'm in favor of moving the depreciation schedule of the axes attribute and the legacy grid functions back a bit. They are not conflicting with any of the new layout so I think including them in the next two major release(1.6.0 and 1.7.0) will not be a problem.

It would be possible to never remove these legacy Grid functions and the axes attribute but I do not like this option. Having multiple methods for access the same data and multiple file formats is not a good long term solution. To Quote from the Zen of Python:

There should be one-- and preferably only one --obvious way to do it.
gamaanderson commented 8 years ago

I will try to review this PR as soon as I can. I'm particularly interested to see if we can get the netcdf files to be recognized in netcdf java.

gamaanderson commented 8 years ago

I have give it a look and it is good, despite from a few points I commented. I do have some other points however.

  1. The good news is that it worked in artview without any alteration, so the legacy is working. W only need to add support for the read_legacy_grid function. (and also change to the new format before the old one is removed)
  2. The routines for reading and writing mdv need some updating to handle the projection information, now that we have that information. For the other io function I don't know, but it is something to be taken in consideration.
  3. I would like to have some kind of access to the Proj instance that represents the grid projection (assuming it is using PyProj). Also, if I want to change the projection, should I manipulate the grid.projection dict directly?
  4. Unfortunately, as I'm no longer in Simepar, I can't test it, but I'm pretty sure the written netcdf is not automatically geolocalisated in netcdf java. One would need some "special" attributes. I think the question here is: Do you want the ARM netcdf grid format to be automatically geolocalisated in netcdf java? if yes we should work on this before launching the new format, if no we can handle netcdf_java_files as a separated format and work on it in an other PR. I will get in touch with @CesarBeneti to make some tests, so I know exactly what we need in the files.
  5. When writing netcdf shouldn’t we add a "conventions" attribute just like CfRadial does? this would help to recognize this convention in opposition to other ones.

I think this is all, nice work as always.

jjhelmus commented 8 years ago

@gamaanderson Thanks for review the PR, always good to have a second set of eyes on the code before making large changes.

jjhelmus commented 8 years ago

The good news is that it worked in artview without any alteration, so the legacy is working. W only need to add support for the read_legacy_grid function. (and also change to the new format before the old one is removed)

Good to hear! Hopefully the transition to the new format will not be too difficult. Ping me if you need any help with this.

The routines for reading and writing mdv need some updating to handle the projection information, now that we have that information. For the other io function I don't know, but it is something to be taken in consideration.

Agreed, read/write_grid_mdv will need to be updated with the projection information. I do not have enough experience with MDV grid files to feel comfortable making the necessary changes. I will open an issue for this work once this PR is merged so that other are aware of the need.

I do not believe there are any other io functions besides the read/write_grid and read_legacy_grid function which operate on grids.

I would like to have some kind of access to the Proj instance that represents the grid projection (assuming it is using PyProj). Also, if I want to change the projection, should I manipulate the grid.projection dict directly?

This would be nice and can be accomplished with a class property which is read only (cannot be set). The only question is what should happen when the projection is set to the Py-ART implementation of azimuth equidistant? Raise an exception, return a Proj like class, something else? If the Proj like clas, how much of the Proj interface needs to be replicated.

As for modifying the projection, this should be done by changing the grid.projection dictionary directly. The read_grid function does this to set the projection from what is recorded in the file. Would having an optional projection parameter added to the call signature be helpful?

Unfortunately, as I'm no longer in Simepar, I can't test it, but I'm pretty sure the written netcdf is not automatically geolocalisated in netcdf java. One would need some "special" attributes. I think the question here is: Do you want the ARM netcdf grid format to be automatically geolocalisated in netcdf java? if yes we should work on this before launching the new format, if no we can handle netcdf_java_files as a separated format and work on it in an other PR. I will get in touch with @CesarBeneti to make some tests, so I know exactly what we need in the files.

I'll run a few tests on these new files here at Argonne. I suspect that a different format will be required to get the geolocalisation working in the Java tools. Depending on how many changes are needed to the format a separate format may make sense. Having a simple format to serialized the Grid class to and from files may need to be separate from a format which interacts nicely with other tools. That said if the changes are minor I'd like to limit the number of grid formats, so some testing is warranted.

When writing netcdf shouldn’t we add a "conventions" attribute just like CfRadial does? this would help to recognize this convention in opposition to other ones.

Adding some kind of global attribute to identify these files is a good idea. Let me think about the best method to do this.

gamaanderson commented 8 years ago

I agree with all comments, the only point I believe need an answer is: yes, I do think an optional projection parameter is a good idea.

jjhelmus commented 8 years ago

I had to look at what is required for the Java NetCDF tools to interpret the NetCDF files from the write_grid functions as a gridded coordinate system and this is possible by adding one variable to the file that defines the coordinate system and projection. Big thanks to @gamaanderson for the links to the thredds documentation on the topic.

For example the following is correctly interpreted as containing a single time, height, geo_x, geo_y grid centered around northern Illinois:

netcdf test {
dimensions:
    x = 464 ;
    y = 464 ;
    z = 3 ;
    time = 1 ;
variables:
    float x(x) ;
        x:units = "km" ;
    float y(y) ;
        y:units = "km" ;
    float z(z) ;
        z:units = "km" ;
        z:positive = "up" ;
    float time(time) ;
        time:units = "seconds since 1970-01-01T00:00:00Z" ;
    float FakeReflectivity(time, z, y, x) ;
    char ProjectionCoordinateSystem ;
        ProjectionCoordinateSystem:grid_mapping_name = "flat_earth" ;
        ProjectionCoordinateSystem:latitude_of_projection_origin = 39.5388 ;
        ProjectionCoordinateSystem:longitude_of_projection_origin = -90.84 ;
        ProjectionCoordinateSystem:_CoordinateTransformType = "Projection" ;
        ProjectionCoordinateSystem:_CoordinateAxes = "x y z time" ;
        ProjectionCoordinateSystem:_CoordinateAxesTypes = "GeoX GeoY Height Time" ;
}

A variable similar to ProjectionCoordinateSystem will need to be added to the NetCDF files created by the write_grid function. I think that this variable should be independent of the variable which contains the Grid.projection attribute. Also the write_grid function should offer a flag which will skip writing this variable for cases when it is not needed or cannot be determine.

The CDM model which Java-netCDF uses only knows about a limited number of projections, so not all pyproj projections can be represented in the file. I'm planning on writing a function that will generate an appropriate CDM ProjectionCoordinateSystem variable given a Grid.projection dictionary. When there is not a mapping better the Grid.projection and a CDM projection, this function will inform the caller (raise an exception, return None, etc). In the write_grid function this will trigger a warnings and the variable will not be included in the output. User will also be able to pass their own version of this function if desired.

Finally, I think the regular_x, regular_y, and regular_z should be renamed to x, y, z. There is no requirement that the x, y, and z coordinate need have regular spacing, only that the final grid is rectilinear using x, y, and z and the orthogonal axes.

jjhelmus commented 8 years ago

I changed the dimensions in the Grid class and files to be named x, y, and z, added access to the PyProj Proj object by way of the Grid.projection_proj property, added a global Conventions attribute to the output grid files, and added a ProjectionCoordinateSystem variable to the file which should allow them to be used with tools using the Java NetCDF library (THREDDS, toolsUI, etc)

gamaanderson commented 8 years ago

Good. I've sent a file for testing in Simepar. I'm not sure if the lack of a _CoordinateSystems attribute in the field is a problem.

jjhelmus commented 8 years ago

It shouldn't according to the "Implicit Coordinate Systems" section at http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/tutorial/CoordinateAttributes.html and it did not cause an issue when the tests I ran. Still having others test to see if the files work with their software would be helpful.

gamaanderson commented 8 years ago

The File worked in Simepar. I believe there are still some things to be done (graphic, mdv etc.), but those can be done in other PR's.