NRLMMD-GEOIPS / geoips

Main Geolocated Information Processing System code base with basic functionality enabled.
https://nrlmmd-geoips.github.io/geoips/

Consider exposing which variables Readers can ingest and create generic utilities that can be used by all readers #475

Open evrose54 opened 5 months ago

evrose54 commented 5 months ago

Requested Update

Description

Readers are fundamental to GeoIPS, yet they do not expose pertinent information, such as which variables they can ingest, to the user. They also implement data conversions (e.g. radiance to brightness temperature) with code that is very similar from one reader to another. Both of these points could be improved or consolidated.

We should consider adding the variables that each reader can read to that reader's entry in the plugin registry. This would require modifying every reader so that it declares a list of the variable names it can read, but the plugin registry could then expose that information to users through the CLI, which would be very helpful. I envision a reader's entry in the plugin registry looking like this:

    windsat_remss_winds_netcdf:
      docstring: Read derived surface winds from REMSS WINDSAT netcdf data.
      family: standard
      interface: readers
      package: geoips
      plugin_type: module_based
      relpath: plugins/modules/readers/windsat_remss_winds_netcdf.py
      signature: (fnames, metadata_only=False, chans=None, area_def=None, self_register=False)
      variables: [var_name1, var_name2, var_name3, ...]
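As a rough illustration of what the proposed `variables` field could enable, here is a minimal sketch that looks up which readers declare a given variable. The nested dictionary layout, the `example_reader` entry, and the `readers_providing` helper are all hypothetical and only mirror the entry shown above; they are not the actual GeoIPS registry structure or API.

```python
# Hypothetical in-memory view of the readers section of the plugin registry.
# Only the "variables" key is the field proposed in this issue.
registry = {
    "readers": {
        "windsat_remss_winds_netcdf": {
            "family": "standard",
            "variables": ["var_name1", "var_name2", "var_name3"],
        },
        # Hypothetical reader that has not declared its variables yet.
        "example_reader": {"family": "standard"},
    }
}


def readers_providing(registry, variable):
    """Return the sorted names of readers that declare ``variable``."""
    return sorted(
        name
        for name, entry in registry.get("readers", {}).items()
        if variable in entry.get("variables", [])
    )


print(readers_providing(registry, "var_name2"))
```

Readers without a `variables` list are simply skipped, so readers could be updated one at a time.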

Once we know which variables a reader can ingest, we could create generic utility functions that perform conversions on data (radiance to brightness temperature, ...), generic read functions (xarray.open_dataset() --> retrieve certain variables, ...), and other utilities that could be refactored out and used by any reader.

The generic utilities would need a great deal of consideration, since conventions for variable names, file types, etc. have not been enforced by satellite data producers, but they would make developing readers a much easier process.
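A generic "retrieve certain variables" helper might look something like the sketch below. The `select_variables` name and its error behavior are assumptions for illustration only. It accepts any mapping of variable names to data, which is how an `xarray.Dataset` returned by `xarray.open_dataset()` behaves, so the same function could be exercised with a plain dictionary.

```python
def select_variables(dataset, wanted):
    """Return only the requested variables from a mapping-like dataset.

    Hypothetical helper: raises KeyError naming every missing variable so
    that a reader fails early with a clear message.
    """
    missing = [name for name in wanted if name not in dataset]
    if missing:
        raise KeyError(f"Variables not found in dataset: {missing}")
    return {name: dataset[name] for name in wanted}


# Usage with a plain dict standing in for an opened dataset.
fake_dataset = {"var_name1": [1, 2], "var_name2": [3, 4], "var_name3": [5, 6]}
subset = select_variables(fake_dataset, ["var_name1", "var_name3"])
print(sorted(subset))
```

If readers declared their variables in the registry, the `wanted` list could be validated against that declaration before any file is opened.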

Background and Motivation

This issue stemmed from a conversation @jsolbrig and I had when reviewing a GREMLIN reader I had produced for the GeoIPS GLM package.

Alternative Solutions

Leave the code as is and rewrite it where necessary (neither optimal nor good coding practice).

Environment

Code to demonstrate issue

See geoips/geoips/plugins/modules/readers/abi_netcdf.py for an example of variables that we could expose. Lines 1128-1130 of that file show an example of a conversion that we could move into generic utilities. It converts radiance to brightness temperature for ABI data.

        data["BT"][~bad_data_mask] = ne.evaluate(
            "(fk2 / log(fk1 / rad_data + 1) - bc1) / bc2"
        )
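Pulled out of the reader, that expression could become a standalone utility along these lines. This is only a sketch: the function name, its signature, and the default bad-data rule are assumptions, and it uses plain NumPy rather than the `numexpr` call in the reader. The formula itself is the one shown above.

```python
import numpy as np


def radiance_to_brightness_temperature(
    rad_data, fk1, fk2, bc1, bc2, bad_data_mask=None
):
    """Convert radiance to brightness temperature.

    Applies (fk2 / log(fk1 / rad_data + 1) - bc1) / bc2 to the good pixels
    and leaves NaN where ``bad_data_mask`` is True.
    """
    rad_data = np.asarray(rad_data, dtype=float)
    if bad_data_mask is None:
        # Assumption: treat non-finite or non-positive radiances as bad data.
        bad_data_mask = ~np.isfinite(rad_data) | (rad_data <= 0)
    good = ~np.asarray(bad_data_mask, dtype=bool)

    bt = np.full(rad_data.shape, np.nan)
    bt[good] = (fk2 / np.log(fk1 / rad_data[good] + 1.0) - bc1) / bc2
    return bt
```

A reader would pass in the calibration coefficients it reads from its own files, so the utility itself stays independent of any one sensor's variable names.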