eco4cast / unconf-2023

Brainstorming repo to propose and discuss unconference project ideas!
Develop module for grabbing S2S weather forecasts (e.g. NMME) analogous to GEFS down #4

Open mdietze opened 1 year ago

cboettig commented 1 year ago

This sounds cool! Can you drop in some links to canonical sources for the S2S forecasts? (i.e. is this the right thing? https://ftp.cpc.ncep.noaa.gov/NMME/prob/netcdf/ )

(Bonus points for sources that are part of NOAA's big data program, i.e. hosted on a commercial cloud provider.) Do you think NOAA CFS would be of interest?

mdietze commented 1 year ago

@cboettig yep, that's the right product. FYI here's @juniperlsimonis and @ethanwhite's code for downloading NMME: https://github.com/weecology/portalcasting/blob/2b3451a9da3a91bbf63157889d1e16dad3b58259/R/download.R

Yes, I think folks would also be interested in NOAA CFS, but I'd say it's a slightly lower priority because it only has 4 ensemble members (I think NMME is n=7).

ECMWF also has a seasonal forecast product but I've never been able to figure out how to get access to it https://www.ecmwf.int/en/forecasts/datasets/set-v

NCAR also has a decadal ensemble forecast: https://www.cesm.ucar.edu/community-projects/dple What's publicly available seems to only go to 2015, but in her talk in EFI's AGU2022 session Nikki Lovenduski seemed to be saying that there's a new product that's being updated. I haven't pinged her yet to get more details but would be happy to if there's interest.

ashander commented 1 year ago

Not a lot of uptake, eh? But I'm really interested in working on S2S forecasts - whether at the Unconf or elsewhere!

emmamendelsohn commented 1 year ago

We've been able to access ECMWF seasonal forecast data through the Copernicus API: https://cds.climate.copernicus.eu/cdsapp#!/dataset/seasonal-monthly-single-levels?tab=form
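For anyone who wants to try the Copernicus route, here is a minimal sketch using the `cdsapi` Python client. It is not the download routine referred to in this thread. The dataset name comes from the URL above; the request keys and values (`originating_centre`, `system`, `variable`, `product_type`, `leadtime_month`, `format`) are assumptions based on the dataset's download form and should be checked against it, since the form's "Show API request" output is the authoritative version. The helper names are hypothetical.

```python
def build_seasonal_request(year, month, lead_months,
                           variable="2m_temperature", system="51"):
    """Assemble a request dict for the seasonal monthly single-levels dataset.

    All keys and default values are assumptions; compare them with the
    request generated by the dataset's web form before relying on them.
    """
    return {
        "originating_centre": "ecmwf",
        "system": system,                      # assumed model system version
        "variable": variable,
        "product_type": "monthly_mean",
        "year": str(year),
        "month": f"{int(month):02d}",          # forecast start month
        "leadtime_month": [str(m) for m in lead_months],
        "format": "grib",
    }


def download_seasonal(request, target="seasonal.grib"):
    """Submit the request; needs a CDS account and a ~/.cdsapirc key file."""
    import cdsapi  # imported here so the request builder works without it

    client = cdsapi.Client()
    client.retrieve("seasonal-monthly-single-levels", request, target)
    return target


request = build_seasonal_request(2023, 3, lead_months=[1, 2, 3])
# download_seasonal(request)  # uncomment once credentials are configured
```

Keeping the request builder separate from the download call makes it easy to loop over start months or variables without touching the network code.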

Hoping to publish the repo with the download and processing routine by next week, just needs some cleanup and documentation. I'd be glad to work on this type of issue at the unconf!

cboettig commented 1 year ago

ECMWF is also available from Azure cloud via Microsoft's STAC API: https://planetarycomputer.microsoft.com/dataset/ecmwf-forecast, and also in an AWS bucket, I think: https://registry.opendata.aws/ecmwf-forecasts/
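A rough sketch of what querying that STAC API could look like from Python, assuming the `pystac-client` and `planetary-computer` packages are installed. The collection id `ecmwf-forecast` is taken from the dataset URL above; the date range, the `max_items` cap, and the helper names are illustrative assumptions.

```python
PC_STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def search_kwargs(start, end, max_items=10):
    """Build STAC search arguments for a date range (ISO 8601 strings)."""
    return {
        "collections": ["ecmwf-forecast"],
        "datetime": f"{start}/{end}",
        "max_items": max_items,
    }


def list_ecmwf_assets(start, end, max_items=10):
    """Return (item id, asset href) pairs for matching forecast items."""
    import pystac_client
    import planetary_computer

    catalog = pystac_client.Client.open(
        PC_STAC_URL,
        modifier=planetary_computer.sign_inplace,  # signs asset URLs for Azure
    )
    search = catalog.search(**search_kwargs(start, end, max_items))
    return [
        (item.id, asset.href)
        for item in search.items()
        for asset in item.assets.values()
    ]


# assets = list_ecmwf_assets("2023-03-01", "2023-03-02")  # requires network
```

The returned hrefs are the kind of direct pointers to spatial assets that a GDAL-based workflow can read without downloading whole files first.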

We actually added NOAA CFS to the existing gefs4cast and have already computed those snapshots. The worst part is the somewhat esoteric and poorly documented rules for the length of each ensemble member: for the 6-hourly cycle runs, the first member runs up to 7 months but always ends on the last day of a month, while the other 3 run 4 months, also ending on the last day of a month. The standard trick at NOAA if you want more members is just to use those from nearby cycles. When we're talking about forecasts with a 7-month horizon, an ensemble member that technically starts 6 hours earlier or later than another one is essentially the same thing as another perturbed member started at the same time. So you easily get 16 members every 24 hours.
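The arithmetic behind the 16 can be sketched as follows: four 6-hourly cycles per day, each with four members, pooled into one lagged ensemble. The horizon lengths are the nominal values described above (the exact end-of-month truncation is not modelled), and the function and constant names are hypothetical, not part of gefs4cast.

```python
from datetime import datetime, timedelta
from itertools import product

CYCLE_HOURS = (0, 6, 12, 18)       # the four 6-hourly cycles in a day
MEMBERS_PER_CYCLE = 4
# Nominal horizon per member, as described in the thread: the first member
# runs up to 7 months, the other three run 4 months.
NOMINAL_HORIZON_MONTHS = {1: 7, 2: 4, 3: 4, 4: 4}


def lagged_ensemble(day):
    """Pool every member from every cycle on `day` into one lagged ensemble."""
    pooled = []
    for hour, member in product(CYCLE_HOURS, range(1, MEMBERS_PER_CYCLE + 1)):
        pooled.append({
            "start": day + timedelta(hours=hour),
            "member": member,
            "horizon_months": NOMINAL_HORIZON_MONTHS[member],
        })
    return pooled


members = lagged_ensemble(datetime(2023, 3, 1))
long_range = [m for m in members if m["horizon_months"] == 7]
print(len(members), len(long_range))
```

Under these assumptions the pooled ensemble has 16 members for roughly the first 4 months, but only the 4 long-horizon members (one per cycle) reach 7 months.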

Not sure if it would be interesting, but Microsoft Research claims to have an AI model that outperforms CFS, which I think we could get access to as well (though personally I still need to be convinced that it is better in strictly proper probabilistic skill score terms and is not underestimating low-probability fluctuations...).
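As one concrete example of a strictly proper score, here is a small sketch of the continuous ranked probability score (CRPS) estimated from ensemble members, using the standard form E|X - y| - 0.5 E|X - X'|. Choosing CRPS here is an assumption for illustration; the thread does not say which score would be used, and the numbers below are made up.

```python
def crps_ensemble(members, obs):
    """Ensemble estimate of CRPS: mean |x_i - obs| - 0.5 * mean |x_i - x_j|.

    Lower is better. With a single member it reduces to absolute error.
    """
    n = len(members)
    mean_abs_err = sum(abs(x - obs) for x in members) / n
    mean_spread = sum(abs(a - b) for a in members for b in members) / (n * n)
    return mean_abs_err - 0.5 * mean_spread


# Two made-up ensembles with the same mean, scored against an observation
# that lands in the tail: the overconfident one is penalised more.
narrow = [-0.1, 0.0, 0.1]
wide = [-3.0, 0.0, 3.0]
obs = 3.0
print(crps_ensemble(narrow, obs) > crps_ensemble(wide, obs))
```

This is the sense in which a model that underestimates low-probability fluctuations can look good on point-forecast error yet score worse on a proper probabilistic metric.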

We could scan the AWS - NOAA Open Dataset Program collection too. Once we can point directly to spatial data assets (grib2, netCDF, etc.) from a high-bandwidth provider, I think the strategy we're using in the gefs4cast package (gdalcubes) should be practical. I'm probably biased, but I do think this would make a good unconf project where a team could get a solid prototype operational and maybe pick up some general-purpose skills for working with these large collections efficiently.
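"Pointing directly" at a remote asset usually means handing GDAL a virtual file system path rather than a downloaded file. A minimal sketch of that translation, using GDAL's `/vsicurl/` and `/vsis3/` prefixes; the helper name and the example URLs are hypothetical placeholders, not real assets.

```python
def to_gdal_vsi(url):
    """Translate a remote URL into a GDAL virtual file system path."""
    if url.startswith("s3://"):
        # /vsis3/ takes "bucket/key" with the scheme removed
        return "/vsis3/" + url[len("s3://"):]
    if url.startswith(("http://", "https://")):
        # /vsicurl/ takes the full URL, scheme included
        return "/vsicurl/" + url
    raise ValueError(f"unsupported URL scheme: {url}")


# Hypothetical asset locations, for illustration only.
print(to_gdal_vsi("https://example.org/forecasts/member01.grib2"))
print(to_gdal_vsi("s3://example-bucket/forecasts/member01.grib2"))
```

For public buckets read through `/vsis3/`, GDAL typically also needs the `AWS_NO_SIGN_REQUEST=YES` configuration option so that it does not look for credentials.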