google / weather-tools

Tools to make weather data accessible and useful.
https://weather-tools.readthedocs.io/
Apache License 2.0

Enhanced support in weather-dl for downloading data across month ranges spanning multiple years. #372

Open dabhicusp opened 1 year ago

dabhicusp commented 1 year ago

I have a suggestion for an enhancement to the weather-dl tool. Currently, downloading data for a month range that spans multiple years requires manually splitting the range into separate configurations and running a separate job for each part. This is tedious and time-consuming.

For instance, if we wish to download data from April 1994 to August 2012, and the target path follows this format: (gs://xyz/{year}/{year}{month:02d}_hres_var.grb2), we need to split the dates into three parts as follows:

Part 1: year=1994 month=04/to/12 day=all

Part 2: year=1995/to/2011 month=01/to/12 day=all

Part 3: year=2012 month=01/to/08 day=all
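To make the manual split concrete, here is a minimal Python sketch that derives the same three parts from a start and end month. The function name `split_month_range` and the dictionary layout are hypothetical and are not part of weather-dl; the sketch only reproduces the `year`, `month`, and `day` values listed above.

```python
from typing import Dict, List, Tuple


def split_month_range(start: Tuple[int, int], end: Tuple[int, int]) -> List[Dict[str, str]]:
    """Split an inclusive (year, month) range into at most three parts:
    a partial first year, a run of full years, and a partial last year.

    Hypothetical helper for illustration; not part of weather-dl.
    """
    (y0, m0), (y1, m1) = start, end
    if (y0, m0) > (y1, m1):
        raise ValueError("start must not be after end")

    # A range inside a single year needs no splitting.
    if y0 == y1:
        return [{"year": str(y0), "month": f"{m0:02d}/to/{m1:02d}", "day": "all"}]

    parts: List[Dict[str, str]] = []
    # Years that are covered from January through December.
    first_full = y0 if m0 == 1 else y0 + 1
    last_full = y1 if m1 == 12 else y1 - 1

    if m0 != 1:
        parts.append({"year": str(y0), "month": f"{m0:02d}/to/12", "day": "all"})
    if first_full <= last_full:
        year = str(first_full) if first_full == last_full else f"{first_full}/to/{last_full}"
        parts.append({"year": year, "month": "01/to/12", "day": "all"})
    if m1 != 12:
        parts.append({"year": str(y1), "month": f"01/to/{m1:02d}", "day": "all"})
    return parts


if __name__ == "__main__":
    # April 1994 to August 2012, as in the example above.
    for part in split_month_range((1994, 4), (2012, 8)):
        print(part)
```

Each returned dictionary corresponds to one configuration that currently has to be written and submitted by hand.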

I propose enhancing the weather-dl parser by introducing a new configuration keyword, such as month_year. Its value would combine month and year in the form 08-2012 or 01-2022. We would also add logic to the weather-dl parser so that it interprets this keyword correctly.
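As a rough illustration of what interpreting such a value could involve, the Python sketch below parses an `MM-YYYY` string and walks every month between two such values. The function names, the strict `MM-YYYY` pattern, and the idea of passing a start and an end value are all assumptions made for this example; none of it reflects existing weather-dl behaviour or a decided syntax.

```python
import re
from typing import Iterator, Tuple

# Assumed format: two-digit month, hyphen, four-digit year (e.g. "08-2012").
_MONTH_YEAR = re.compile(r"^(\d{2})-(\d{4})$")


def parse_month_year(value: str) -> Tuple[int, int]:
    """Parse a hypothetical month_year value into a (year, month) tuple."""
    match = _MONTH_YEAR.match(value.strip())
    if not match:
        raise ValueError(f"expected MM-YYYY, got {value!r}")
    month, year = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {value!r}")
    return year, month


def iter_months(start: str, end: str) -> Iterator[Tuple[int, int]]:
    """Yield every (year, month) from start to end, inclusive."""
    year, month = parse_month_year(start)
    last = parse_month_year(end)
    while (year, month) <= last:
        yield year, month
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


if __name__ == "__main__":
    # Target path template taken from the example earlier in this issue.
    template = "gs://xyz/{year}/{year}{month:02d}_hres_var.grb2"
    for year, month in iter_months("11-1994", "02-1995"):
        print(template.format(year=year, month=month))
```

Returning `(year, month)` tuples keeps ordering and comparison trivial, and each pair can be substituted directly into the target path template.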

P.S.: A similar use-case can be found in the Arco-era5 dataset. Please refer to any file containing sfc in the filename for more context.

alxmrs commented 1 year ago

I definitely see the value in this language feature; I think it would reduce a lot of boilerplate. Can we explore a few options for the keyword or syntax to express this? How about you propose a few options, and we'll decide together on the best approach.