Enable users to view and retrieve derived variables/indices without using the DataInterface GUI. This wasn't previously an option.
Summary of changes and related issue
Derived variables/indices are now available for viewing using the existing function get_data_options()
Derived variables/indices are now available for retrieval using the existing function get_data()
Added a new column "dependencies" to the variable_descriptions.csv file, which is used to determine variable dependencies for the derived variables/indices. This helped me determine the correct subset/combo of options for each variable to return to the user.
Wrote some unit tests (hell yeah!!)
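To illustrate how a "dependencies" column could drive the option lookup described above, here is a minimal sketch. It is not the climakitae implementation: the CSV column names, the variable ids (u10, v10), the comma-separated dependency format, and the CATALOG_OPTIONS table are all assumptions made for the example. The idea shown is that a derived variable is only offered for the option combinations that every one of its dependencies supports.

```python
import csv
import io

# Hypothetical stand-in for variable_descriptions.csv; the real column
# names, variable ids, and dependency separator may differ.
SAMPLE_CSV = """variable_id,display_name,dependencies
u10,West-East component of Wind at 10m,None
v10,North-South component of Wind at 10m,None
wind_speed_derived,Wind speed at 10m,"u10, v10"
"""

# Hypothetical (timescale, resolution) combinations available in the
# catalog for each native variable.
CATALOG_OPTIONS = {
    "u10": {("hourly", "3 km"), ("hourly", "9 km"), ("hourly", "45 km")},
    "v10": {("hourly", "9 km"), ("hourly", "45 km")},
}


def load_dependencies(csv_text):
    """Map each derived variable's display name to its dependency ids."""
    deps = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = row["dependencies"].strip()
        # Native catalog variables have no dependencies and are skipped.
        if raw and raw.lower() != "none":
            deps[row["display_name"]] = [d.strip() for d in raw.split(",")]
    return deps


def options_for_derived(display_name, deps, catalog_options):
    """Return the option combinations shared by every dependency."""
    option_sets = [catalog_options[d] for d in deps[display_name]]
    return set.intersection(*option_sets)


deps = load_dependencies(SAMPLE_CSV)
wind_options = options_for_derived("Wind speed at 10m", deps, CATALOG_OPTIONS)
print(sorted(wind_options))
```

In this toy data the 3 km combination is dropped for wind speed because only one of the two dependencies provides it.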
Relevant motivation and context
When I wrote get_data_options() and get_data(), I didn't realize they only exposed the catalog variables and gave no way to view or retrieve derived variables. Calvin mentioned he needed to get derived variables through get_data(), so I took the opportunity to extend both functions.
NOTE: This required a shocking amount of code... the logic in climakitae for accessing the correct subset/combination of options for each variable is really complex.
These are the variable ("display") names of all the derived variables:
Relative humidity [hourly only; daily/monthly is native in the catalog]
Wind speed at 10m
Wind direction at 10m
Dew point temperature
Specific humidity at 2m
Fosberg fire weather index
Effective Temperature
NOAA Heat Index
How to test
1) Run through the notebook climakitae_direct_data_download.ipynb to ensure that it still works correctly, since I modified the functions used in that notebook
2) Try retrieving some derived variables/indices. For example:
# See all the data combinations (i.e. different resolution, scenario, etc.) for the Fosberg fire weather index
get_data_options(variable="Fosberg fire weather index")

# Retrieve data for the Fosberg fire weather index
get_data(
    variable="Fosberg fire weather index",
    scenario="Historical Climate",
    timescale="hourly",
    resolution="9 km",
    downscaling_method="Dynamical",
    time_slice=(1990, 1991),
)

# See all the data combinations (i.e. different resolution, scenario, etc.) for Dew point temperature
get_data_options(variable="Dew point temperature")

# Retrieve data for Dew point temperature
get_data(
    variable="Dew point temperature",
    scenario=["Historical Climate", "SSP 2-4.5 -- Middle of the Road"],
    timescale="monthly",
    resolution="45 km",
    downscaling_method="Dynamical",
    time_slice=(1990, 2035),
)
Type of change
[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update
Definition of Done Checklist
Practical
[x] 80% unit test coverage
[x] Documentation
[x] All functions/adjusted functions documented in the readthedocs.
[x] Documentation is pushed
[x] Complex code commented
[x] Naming conventions followed
[x] Helper functions hidden with _ before the name
[x] Context of function is clearly provided
[x] Intent of function is provided
[x] How to test, so that it is not siloed on scientists and anyone can review
[x] Appropriate manual testing was completed
[x] Any notebooks known to utilize the affected functions are still working
[x] Linting completed and resolved
Conceptual
[x] Doesn't replicate existing functionality
[x] Aligns with general coding standard of existing functions
[x] Matches desired functionality from users/scientists