dklinges9 / mcera5

little to-do list #4

Closed everydayduffy closed 1 year ago

everydayduffy commented 3 years ago

- Improve description of DTR correction value in point_nc_to_df.
- Rename point_nc_to_df & point_nc_to_df_precip to extract_weather & extract_precip.
- Remove time as inputs to processing functions and just deal with everything on a yearly basis? This would mean changes in the request builder, forcing everything to yearly.

dklinges9 commented 3 years ago

Hi James @everydayduffy,

Since I'm watching this repository, I was notified of this issue. As someone who has been using your functions regularly at the moment (and they've been great), I'd exert some caution before removing time as inputs to the functions (I presume you mean point_nc_to_df and point_nc_to_df_precip). The reason: I've spent a lot of time recently re-chunking and re-ordering ERA5 netCDF files to get optimally fast read/write speeds (in case the idea of chunking is new, here's a great primer blog post series and also a useful StackExchange post from David LeBauer). Depending on the use case and spatial extent, it may be faster for users to work with daily, monthly, yearly, or interannual netCDFs. Requiring a yearly basis lends itself to microclima and certainly NicheMapR use, but it may constrain users to a netCDF structure that is slow for them.
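To make the chunking point above concrete, here is a minimal sketch that counts how many chunks a read has to touch under two layouts. The grid dimensions, the chunk shapes, and the helper `chunks_touched` are all illustrative assumptions (not taken from mcera5 or from any real ERA5 file), and the count assumes reads that start on a chunk boundary.

```python
import math

def chunks_touched(read_shape, chunk_shape):
    """Count the chunks overlapped by a read that starts on a chunk boundary."""
    return math.prod(math.ceil(r / c) for r, c in zip(read_shape, chunk_shape))

# Hypothetical (time, lat, lon) reads on a one-year hourly 0.25-degree global grid.
point_series = (8760, 1, 1)     # one grid cell, every time step
global_slice = (1, 721, 1440)   # every grid cell, one time step

# Layout A (assumed): each chunk holds one time step for the whole globe.
map_chunks = (1, 721, 1440)
# Layout B (assumed): each chunk holds the whole year for a 10 x 10 tile.
series_chunks = (8760, 10, 10)

point_a = chunks_touched(point_series, map_chunks)     # many chunks for one point
point_b = chunks_touched(point_series, series_chunks)  # a single chunk
slice_a = chunks_touched(global_slice, map_chunks)     # a single chunk
slice_b = chunks_touched(global_slice, series_chunks)  # many chunks for one map

print(point_a, point_b, slice_a, slice_b)
```

Neither layout wins for both access patterns, which is why a fixed yearly file structure may suit some users and slow others down.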

In the same vein, I think the painfully slow request/query time for a timeseries at a single point from the Climate Data Store is in large part due to the way their netCDFs are stored (lat/long first, then the time dimension last), whereas it's much faster to query a global extent for a shorter timeseries. I therefore found it faster to query the Climate Data Store myself (I did monthly, global chunks for my purposes) and then plug the results into point_nc_to_df afterwards (I've actually been meaning to reach out and offer my Python querying code as possibly of some use for the request function(s), but I don't think it's fully helpful yet).
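As a rough illustration of the "monthly, global chunks" approach described above, the sketch below splits a date range into one request per calendar month. It is not the querying code mentioned in this comment; the function `monthly_requests` is hypothetical, and the dictionary keys are only modelled on typical Climate Data Store requests, so treat them as assumptions. A real request would also need day, time, and format fields, and no spatial subsetting is set here.

```python
def monthly_requests(start_year, start_month, end_year, end_month, variables):
    """Build one request dictionary per calendar month (inclusive range)."""
    requests = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        requests.append({
            "variable": list(variables),  # assumed key name
            "year": str(year),            # assumed key name
            "month": f"{month:02d}",      # assumed key name
        })
        # Advance to the next month, rolling over at the end of the year.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return requests

reqs = monthly_requests(2019, 11, 2020, 2, ["2m_temperature"])
for r in reqs:
    print(r["year"], r["month"])
```

Each dictionary could then be submitted as a separate download, and the resulting files passed to point_nc_to_df one at a time.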

For my applications, I'm using heavily modified local versions of your functions rather than the out-of-the-box versions, so removing time inputs wouldn't affect my use cases per se. I'm just offering my 2 cents on what might be helpful for other users (and possibly me at some future point). Happy to chat further.

dklinges9 commented 1 year ago

These initial to-dos have been addressed and/or are now less relevant, so I'm closing this issue.