cal-adapt / climakitae

A Python toolkit for retrieving, visualizing, and performing scientific analyses with data from the Cal-Adapt Analytics Engine.
https://climakitae.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Improvements to get_data() function to enable pulling data without GUI #450

Closed nicolejkeeney closed 1 month ago

nicolejkeeney commented 1 month ago

Description of PR

Substantial improvements to the get_data() function so that data can be retrieved manually, without the GUI, using a warming levels approach. I also added an option to set the time slice and reorganized some of the function's logic.

How to test

Try using the get_data() function to pull data using a warming levels approach. Use a bunch of different options, see what breaks, and check whether the error message for a bad input is helpful. PLEASE TELL ME YOUR FUNCTION INPUTS IF YOU FIND AN ERROR, SO THAT I CAN REPRODUCE AND FIX IT!

For example:

get_data(
    variable = "Precipitation", 
    downscaling_method = "Dynamical", 
    resolution = "45 km", 
    timescale = "monthly",
    #time_slice = (1800,1900),
    #scenario = "Historical Climate",
    approach = "Warming lEvel",
    warming_level_window = 10,
    warming_level = [2.0, 3.0, 4.0],
    cached_area = "San Bernardino County"
)
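When trying many option combinations, it can help to record the exact inputs of every failing call so they can be pasted into a bug report. The sketch below is one way to do that. It is not part of climakitae: try_combinations and fake_get_data are hypothetical names, and fake_get_data is only a stand-in so the example runs without the library. In practice the real get_data would be passed in its place.

```python
import itertools


def try_combinations(func, fixed_kwargs, varied_kwargs):
    """Call func with every combination of varied_kwargs and collect failures.

    Returns a list of dicts, each holding the inputs that failed and the error text.
    """
    failures = []
    names = list(varied_kwargs)
    for values in itertools.product(*(varied_kwargs[name] for name in names)):
        kwargs = {**fixed_kwargs, **dict(zip(names, values))}
        try:
            func(**kwargs)
        except Exception as err:  # record anything that breaks, with its inputs
            failures.append(
                {"inputs": kwargs, "error": f"{type(err).__name__}: {err}"}
            )
    return failures


def fake_get_data(**kwargs):
    """Hypothetical stand-in for get_data; rejects a non-list warming_level."""
    if not isinstance(kwargs.get("warming_level"), list):
        raise ValueError("warming_level must be a list")
    return kwargs


failures = try_combinations(
    fake_get_data,
    fixed_kwargs={"variable": "Precipitation", "resolution": "45 km"},
    varied_kwargs={
        "warming_level": [[2.0], 1.5],
        "warming_level_window": [10, 15],
    },
)
for failure in failures:
    print(failure["inputs"], "->", failure["error"])
```

Each entry in failures carries the full keyword arguments of the failing call, which is exactly the information needed to reproduce the error.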

See the notebook climakitae_direct_data_download.ipynb for more info on this function.

Let me know if the function documentation is confusing or insufficient in any way.

Summary of changes and related issue

Adds better error handling so that users get more informative messages when bad inputs are provided to the function. Users can now retrieve data using the following new function arguments:

  1. approach
  2. time_slice
  3. warming_level
  4. warming_level_window
  5. warming_level_month
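As a rough illustration of the kind of input checking described above, the sketch below validates two of the new arguments and raises errors that name the bad value and the accepted options. It is not the code in this PR: the helper names, the option names in VALID_APPROACHES, and the choice to match case-insensitively are all assumptions made for the example.

```python
# Assumed option names, for illustration only; the real function may accept others.
VALID_APPROACHES = ("Time", "Warming Level")


def check_approach(approach):
    """Return the canonical approach name, or raise a descriptive ValueError."""
    if isinstance(approach, str):
        for option in VALID_APPROACHES:
            if approach.strip().lower() == option.lower():
                return option
    raise ValueError(
        f"approach={approach!r} is not valid. "
        f"Expected one of: {', '.join(VALID_APPROACHES)}"
    )


def check_time_slice(time_slice):
    """Return (start, end) as ints, or raise a descriptive ValueError."""
    try:
        start, end = time_slice
        start, end = int(start), int(end)
    except (TypeError, ValueError) as err:
        raise ValueError(
            f"time_slice={time_slice!r} is not valid. "
            "Expected a pair of years, e.g. (2015, 2050)"
        ) from err
    if start > end:
        raise ValueError(
            f"time_slice={time_slice!r} is not valid. "
            "The start year must not be after the end year"
        )
    return (start, end)
```

Putting the offending value and the accepted options into the message means a user can usually correct the call without opening the documentation.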

Relevant motivation and context

Part of a larger push to incorporate global warming levels into climakitae.

Let's figure out how to get this into our existing notebooks to make the code cleaner and also demonstrate this new functionality! :D

nicolejkeeney commented 1 month ago

WL integration step 3: retrieve data using the warming levels approach while bypassing the GUI (i.e., using the methods in the climakitae_direct_data_download notebook).

nicolejkeeney commented 1 month ago

Tested a bunch of settings and looking really solid @nicolejkeeney. A few things I noticed:

  • If warming_level is not passed as a list (e.g. warming_level = 1.5), we get the ubiquitous size mismatch error.
  • If warming_level_month is not passed as a list, though, I get this error instead: SyntaxError: positional argument follows keyword argument (Cell In [25], line 13).
  • Since the options for warming_level are floats here, we will need to coordinate with the CAVA dev team (CC: me + @claalmve), since the cava_data function uses strings for WLs. If this functionality gets incorporated into, or replaces, data retrieval there, some modifications will be needed on that end too.

Great work on a beefy task!

Regarding 1 & 2: Excellent bug finding as usual. Those are definitely confusing errors, and I will fix them so they are more intuitive to users.

Regarding 3: I personally think it makes more sense to have the numbers as floats rather than strings. Will work with you and Calvin to get that updated in CAVA.
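One possible way to handle both points, a scalar warming_level and string-valued warming levels coming from other code, is to normalize the input to a list of floats before using it. The helper below is a hypothetical sketch under that assumption. It is not the fix that was merged, and normalize_warming_levels is an invented name.

```python
def normalize_warming_levels(warming_level):
    """Coerce a scalar, a string, or an iterable of either into a list of floats."""
    # Wrap a single value so that warming_level=1.5 behaves like [1.5].
    if isinstance(warming_level, (str, int, float)):
        warming_level = [warming_level]
    try:
        return [float(wl) for wl in warming_level]
    except (TypeError, ValueError) as err:
        raise ValueError(
            f"warming_level={warming_level!r} is not valid. "
            "Expected a number or a list of numbers, e.g. [2.0, 3.0]"
        ) from err


print(normalize_warming_levels(1.5))
print(normalize_warming_levels(["2.0", "3.0"]))
```

With this approach, callers that still pass strings keep working while floats remain the canonical type inside the function.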

nicolejkeeney commented 1 month ago

Fixed everything mentioned by reviewers. Merging into upstream branch (wl3).