icesat2py / icepyx

Python tools for obtaining and working with ICESat-2 data
https://icepyx.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

icepyx required variables #467

Open rwegener2 opened 9 months ago

rwegener2 commented 9 months ago

Summary

This is a write-up on the presence of required variables in ICESat-2 (IS2) products. We find that all supported products except ATL14, ATL15, and ATL23 have the required variables (ATL11 lacks two, which icepyx already handles). This difference can be ignored when ordering granules, but it needs to be accounted for when reading.

Presence of Required Variables

The complete list of variables required by icepyx is:

[
    "sc_orient",
    "sc_orient_time",
    "atlas_sdp_gps_epoch",
    "cycle_number",
    "rgt",
    "data_start_utc",
    "data_end_utc",
    "granule_start_utc",
    "granule_end_utc",
    "start_delta_time",
    "end_delta_time",
]

After inspecting the data products that icepyx supports (list taken from is2ref._validate_product), we find that most of them have all of these variables.

[Screenshot (2023-11-02): required variables present in each supported product]

We see that ATL14, ATL15, and ATL23 are the only products that have none of these variables. ATL11 is missing two, but this is already accounted for in the icepyx code base.
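A check like this can be approximated by comparing a product's variable names against the required list. The sketch below is only an illustration: `missing_required` is a hypothetical helper, not part of icepyx, and it assumes the available variables are given as a list of path strings (for example, whatever `region.order_vars.avail()` returns), matching on the last path component. The example paths are made up.

```python
REQUIRED_VARS = [
    "sc_orient",
    "sc_orient_time",
    "atlas_sdp_gps_epoch",
    "cycle_number",
    "rgt",
    "data_start_utc",
    "data_end_utc",
    "granule_start_utc",
    "granule_end_utc",
    "start_delta_time",
    "end_delta_time",
]


def missing_required(var_paths):
    """Return the required variable names that do not appear in var_paths.

    Hypothetical helper: var_paths is assumed to be an iterable of
    '/'-separated variable paths.
    """
    # Compare on the final path component, e.g. "orbit_info/rgt" -> "rgt".
    present = {path.rsplit("/", 1)[-1] for path in var_paths}
    return [name for name in REQUIRED_VARS if name not in present]


# Made-up paths for illustration only, not read from a real granule.
example_paths = ["orbit_info/sc_orient", "orbit_info/rgt", "h"]
print(missing_required(example_paths))
```

An empty result would mean the product has every required variable; a result equal to the full list corresponds to the ATL14, ATL15, and ATL23 case described above.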

Accounting for this difference in the code base

It seems that, when ordering granules, extraneous variables are ignored. This can be seen with the following code, which runs without error:

import icepyx as ipx

short_name = 'ATL14'
spatial_extent = [-55, 68, -48, 71]
date_range = ['2020-02-20','2020-03-28']
region = ipx.Query(short_name, spatial_extent, date_range)
region.order_vars.avail()  # confirms that required variables are not present
region.order_vars.append(var_list=['h'])
region.order_granules(Coverage=region.order_vars.wanted)
path = './data/ATL14'
region.download_granules(path)

Trying to read the downloaded file, however, results in a ValueError:


path = './data/ATL14'
reader = ipx.Read(path + '/ATL14_GL_0318_100m_003_01_HEGOUT.nc')
reader.vars.append(var_list=['h'])
reader.load()

I suggest, then, that in PR #451 we:

  1. Query: Leave a developer comment noting that unnecessary variables will be ignored by the NSIDC API.
  2. Read: Add an if statement to the load method. Since the list of products (ATL14, ATL15, ATL23) is not the same as the products in build_single_file_dataset, I don't see a reason to combine the logic.
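The Read-side change could look roughly like the following. This is a minimal sketch of the idea, not the code in the PR: `PRODUCTS_WITHOUT_REQUIRED_VARS` and `vars_to_load` are hypothetical names, and the real load method works on icepyx's own variable structures rather than plain lists. The ATL11 special case is handled elsewhere and is not modelled here.

```python
REQUIRED_VARS = [
    "sc_orient",
    "sc_orient_time",
    "atlas_sdp_gps_epoch",
    "cycle_number",
    "rgt",
    "data_start_utc",
    "data_end_utc",
    "granule_start_utc",
    "granule_end_utc",
    "start_delta_time",
    "end_delta_time",
]

# Products reported in this issue to contain none of the required variables.
PRODUCTS_WITHOUT_REQUIRED_VARS = {"ATL14", "ATL15", "ATL23"}


def vars_to_load(product, wanted):
    """Return the variable names to read for a product (hypothetical helper)."""
    if product.upper() in PRODUCTS_WITHOUT_REQUIRED_VARS:
        # These products lack the required variables, so read only
        # what the user asked for.
        return list(wanted)
    # Otherwise put the required variables first, then the user's
    # variables, dropping duplicates while keeping order.
    return list(dict.fromkeys(REQUIRED_VARS + list(wanted)))


print(vars_to_load("ATL14", ["h"]))
```

Keeping the product set separate from the logic in build_single_file_dataset mirrors the point above that the two product lists differ.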

Update: This was quite simple, so it has been pushed to the current version of the PR. We can of course change it!