openclimatefix / nwp

Tools for downloading and processing numerical weather predictions
MIT License

[META] Comparison of NWP Providers #31

Open jacobbieker opened 1 year ago

jacobbieker commented 1 year ago

This issue collates different papers and such on different NWP forecasts, ideally related to solar forecasting, and primarily with the goal of comparing ECMWF and ICON forecasts. I'll also be adding other, relevant papers as I go through them that might help in deciding which NWP models to use.

Context

We are trying to determine which NWP to use next, either as an addition to MetOffice or for areas that MetOffice UKV does not cover. The two primary contenders are ECMWF, as it's generally regarded as the best forecast system, and ICON, which has the advantages of being free, having an archive that we already hold back to 2020, and offering higher spatial and temporal resolution than GFS.

jacobbieker commented 1 year ago

https://research.asrc.albany.edu/people/faculty/perez/2013/forecast-se.pdf compares a GFS-based WRF model to ECMWF, with ECMWF being the winner for a single forecast, but the best results coming from averaging the outputs of both models.
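To illustrate why averaging two models can beat either one alone, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the "truth" values, the error sizes, and in particular the independence of the two models' errors (real NWP errors are usually correlated, which shrinks the benefit). It is not the method used in the paper.

```python
import math
import random


def rmse(forecast, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(forecast, truth)) / len(truth))


rng = random.Random(0)

# Synthetic "observed" irradiance values in W/m^2 (made up, not real data).
truth = [rng.uniform(0.0, 800.0) for _ in range(5000)]

# Two hypothetical models with independent, zero-mean errors of different size.
model_a = [t + rng.gauss(0.0, 60.0) for t in truth]
model_b = [t + rng.gauss(0.0, 80.0) for t in truth]

# Simple unweighted average of the two forecasts.
blend = [(a + b) / 2.0 for a, b in zip(model_a, model_b)]

rmse_a = rmse(model_a, truth)
rmse_b = rmse(model_b, truth)
rmse_blend = rmse(blend, truth)

print(f"model A: {rmse_a:.1f}  model B: {rmse_b:.1f}  blend: {rmse_blend:.1f}")
```

With independent errors the blend's error standard deviation is sqrt(60^2 + 80^2) / 2 = 50, lower than either model's, which is the effect the sketch shows.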

jacobbieker commented 1 year ago

https://journals.ametsoc.org/view/journals/bams/103/9/BAMS-D-21-0234.1.xml This compares quite a few operational forecasts, all initialized with the ECMWF data to reduce differences arising from initialization. Overall, the paper says ICON and IFS are the most similar forecasts globally, as ICON and IFS share a lot of the same parameterization schemes.

So far:

Average root-mean-square difference between all pairs of models for 3-day forecasts over the whole globe: full-BAMS-D-21-0234 1-t4

Same data, for single temperature in northern hemisphere: full-BAMS-D-21-0234 1-f9
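For reference, a pairwise root-mean-square difference of the kind shown in those figures can be sketched as follows. The model names and field values are made-up placeholders, and a real comparison would use gridded fields (and typically area weighting), which this sketch leaves out.

```python
import math
from itertools import combinations


def rmsd(field_x, field_y):
    """Root-mean-square difference between two equal-length fields."""
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(field_x, field_y)) / len(field_x)
    )


# Hypothetical forecast values at the same three grid points (not real data).
fields = {
    "model_a": [1.0, 2.0, 3.0],
    "model_b": [1.0, 2.0, 5.0],
    "model_c": [2.0, 3.0, 3.5],
}

# RMSD for every unordered pair of models.
pair_rmsd = {
    (m1, m2): rmsd(fields[m1], fields[m2]) for m1, m2 in combinations(fields, 2)
}

# The pair with the smallest RMSD is the "most similar" pair.
most_similar = min(pair_rmsd, key=pair_rmsd.get)

for pair, value in pair_rmsd.items():
    print(pair, round(value, 3))
print("most similar:", most_similar)
```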

jacobbieker commented 1 year ago

https://www.mdpi.com/2073-4433/10/9/503/htm

This paper looks at a very high resolution ICON model over Western Africa and compares it to IFS and ICON-global, especially on rainfall and wind speed. Most of the comparison is between the hi-res model and IFS, but they have some comparison of ICON-global and IFS, primarily:

atmosphere-10-00503-g006-550 image

Essentially, b, d, and f compare ICON-global and IFS RMSD for the wind fields.

jacobbieker commented 1 year ago

https://mdpi-res.com/d_attachment/remotesensing/remotesensing-12-03672/article_deploy/remotesensing-12-03672-v2.pdf?version=1604985994 has to do with blending ICON and IFS for solar forecasting, and weighting them. The best result is from weighting them along with an optical flow model working on satellite cloud imagery. This is doing solar forecasts over central Europe in the 1-5 hour range.

Notes:


Just to keep these comparisons up here, adding some more papers here:

Focused on fog forecasting: https://web.archive.org/web/20211015043816id_/https://acp.copernicus.org/preprints/acp-2021-832/acp-2021-832.pdf

Comparison of various NWP models, for things related to fog forecasting: image

This one is on rainfall in southern West Africa: https://rmets.onlinelibrary.wiley.com/doi/pdf/10.1002/qj.3729

The model with the best overall agreement with observations is ECMWF IFS, although ICON gives similarly good results. Both outperform the UKMO and COSMO models. IMERG is the https://rmets.onlinelibrary.wiley.com/doi/pdf/10.1002/qj.3729

It also has a section on low cloud biases in various models over the area of interest: image

IFS showed realistic cloud patterns, but underestimated cloud cover by 17%. In overall bias, the best results came from ICON and COSMO, with slight overestimations of 2-3%. ICON showed a large west-east gradient in cloudiness compared to observations, and an overestimation of clouds along the coast.

It also looks at radiation forecasts. IFS had a clear imprint of the low-level cloud distribution, but with too little cloud cover and too low an optical thickness of clouds. IFS had an average of 201 Wm^-2 and a bias of 51 Wm^-2, while ICON had an unrealistic east-west gradient and a bias of 43 Wm^-2, which pointed to problems with clouds at other levels or with cloud optical thickness in ICON. UKMO had the largest overestimation, with 213.7 Wm^-2 in the area average, and very little structure.

image

Temperature forecasts for various models image

Comparison of rainfall forecasts with station observations: image

jacobbieker commented 1 year ago

@dantravers @peterdudfield @JackKelly Here are some initial looks at papers that compare ICON and ECMWF in some way. The last one does it for 1-5 hour solar forecasts, while others are more broad. ICON seems to be one of the most similar forecasts to IFS, but does seem to do worse for solar forecasts compared to IFS. Another paper that included some radiation comparison in southern West Africa also gives IFS better radiation forecast, with ICON doing worse than IFS, but better than UKMO and COSMO models.

dantravers commented 1 year ago

Great thanks @jacobbieker - that's really useful.

JackKelly commented 1 year ago

Perfect - thank you Jacob!

peterdudfield commented 1 year ago

Thanks, I've asked @devsjc to add some cost analysis in here too

devsjc commented 1 year ago

ECMWF

Archive data

It seems the archive data itself is accessible for free for research purposes. Following the order through, it notes an ordinary subscription cost of $3000 per year but gives a tick-box option, "Request research license", with the description:

Any project organised for non-commercial research purposes only. A necessary condition for the recognition of non-commercial purposes is that all the results obtained are openly available at delivery costs only, without any delay linked to commercial objectives, and that the research itself is submitted for open publication.

The licence is subject to approval and, if accepted, the Information Cost will be set to zero EUR. Any handling and delivery costs will be charged to the Licensee, which may be waived at the discretion of ECMWF.

This research license cannot be used to make a product, and isn't guaranteed. If we get it, the cost is $0; if we don't, $3000/year.

Real time data

There is a subset of real-time data that is openly available, released with an hour's delay, which covers some parameters at a 0.4 degree resolution: https://www.ecmwf.int/en/forecasts/datasets/open-data. It does not include any radiation or cloud parameters.

Creating an order of our usual set of 10 parameters covering the EU would cost $28000/year (not including handling costs or discounts) ![image](https://github.com/openclimatefix/nwp/assets/47188100/01f99be0-ba5a-41cf-aa99-6a3e0947a084)

Other notes

Also worth noting a recent change to the storage location and compression of some of ECMWF's datasets: https://confluence.ecmwf.int/display/DAC/Decommissioning+of+ECMWF+Public+Datasets+Service

We might be able to get a reduced fee for small businesses (<10 headcount) which halves the price: https://www.ecmwf.int/en/forecasts/accessing-forecasts/payment-rules-and-options/tariffs##rt

jacobbieker commented 1 year ago

Thanks for looking into it! For this:

There is a subset of real-time data that is openly available, released with an hour's delay, which covers some parameters at a 0.4 degree resolution: https://www.ecmwf.int/en/forecasts/datasets/open-data.

It's good to note that these parameters don't include any radiation or cloud parameters.

peterdudfield commented 1 year ago

One plus point for ECMWF is Herbie supports it - https://herbie.readthedocs.io/en/stable/

peterdudfield commented 1 year ago

Should we try to fill in something like this

| | ICON | ECMWF |
|---|---|---|
| Global | | |
| Annual cost live | | |
| Dataset cost | | |
| API | | |
| Which performs better in literature | | |
| Other comments | | |

jacobbieker commented 1 year ago

One plus point for ECMWF is Herbie supports it - https://herbie.readthedocs.io/en/stable/

I think only the ECMWF open data though, from looking at it, although we could probably extend it easily enough

jacobbieker commented 1 year ago

Should we try to fill in something like this

| | ICON | ECMWF |
|---|---|---|
| Global | | |
| Annual cost live | | |
| Dataset cost | | |
| API | | |
| Which performs better in literature | | |
| Other comments | | |

From my understanding then:

| | ICON | ECMWF |
|---|---|---|
| Global | 13km | 9km |
| Annual cost live | 0 | 14000€ |
| Dataset cost | 0, only Europe back to 2020, Global 03/2023 | $3000 per year + data cost? |
| API | No, files available on public server | Yes? |
| Which performs better in literature | Worse | Best |
| Other comments | ICON most similar model to ECMWF vs other forecasts, all parameters available | More parameters = more expensive, longer possible forecast horizons (5 vs 10) |

dantravers commented 1 year ago

Great. Thanks all! We have a couple of major opportunities to see for ourselves which is best - smartest energy and India. We should run the best we can and pay for it if we think it's worth it. We can discuss the time investment required to do it later this week.

devsjc commented 1 year ago

Answers to questions raised in NWP meeting:

Is UK only cheaper than full EU? Yes, over 4 times cheaper

How much would India cost? About 0.8 times the cost of EU

How much for a single site e.g. Rajasthan? Approximately $1.5k p/a

How much to include model output statistics - ensemble ones? No added cost - still in same data band

See below for screenshots.

devsjc commented 1 year ago

Realtime Global/European Cut Out costs

Cost of EU with fewer parameters: $28000 + fees (same as EU with all parameters) ![image](https://github.com/openclimatefix/nwp/assets/47188100/f9090e82-2712-4121-a88b-fd32974fa5e1)
Cost of all of India: slightly cheaper than EU at $24000 + fees ![image](https://github.com/openclimatefix/nwp/assets/47188100/6eff5726-9b16-40ee-ba12-cc6378db2473)
Cost of just Rajasthan (loosely drawn area): $1292 + fees ![image](https://github.com/openclimatefix/nwp/assets/47188100/3109f887-1b7f-41fa-8b2e-8dc06dfefff8)
Cost of just UK: $6630 + fees ![image](https://github.com/openclimatefix/nwp/assets/47188100/cf3ebadb-1af5-4d90-8e8e-6f1473802a7d)
Cost of EU + Ensemble parameters for hcc, lcc, mcc, dswrf, dlwrf: $28000 + fees (same as EU without ensemble parameters) ![image](https://github.com/openclimatefix/nwp/assets/47188100/cd33b3c4-e36f-43fd-ae56-0c6ed5e4fd1f)

Interestingly, the cost for ~300 GB per year is the same as the cost for ~30 TB per year!
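As a quick sanity check on the ratios quoted in the meeting answers, the arithmetic on the listed prices can be sketched like this. It uses only the dollar figures above (before fees and discounts); the variable names are just for illustration.

```python
# Quoted annual real-time prices in USD, excluding fees and discounts.
eu_cost = 28000
quotes = {"India": 24000, "Rajasthan": 1292, "UK": 6630}

# Each region's price as a fraction of the EU price.
ratio_to_eu = {region: cost / eu_cost for region, cost in quotes.items()}

# How many times cheaper the UK cut-out is than the EU one.
uk_times_cheaper = eu_cost / quotes["UK"]

for region, ratio in ratio_to_eu.items():
    print(f"{region}: {ratio:.2f} x EU price")
print(f"UK is {uk_times_cheaper:.2f} times cheaper than EU")
```

This gives roughly 0.86 for India and about 4.2 for the UK, in line with the "about 0.8 times" and "over 4 times cheaper" answers.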

JackKelly commented 1 year ago

Great work @devsjc!

Interestingly, the cost for ~300 GB per year is the same as the cost for ~30 TB per year!

That is interesting!

But 30 TB per year sounds like the full ensemble to me (i.e. downloading all 100 IFS ensemble members: 300 GB per member per year x 100 members = 30 TB).

Instead, I'm 80% sure that ECMWF pre-computes some basic summary stats across all 100 ensemble members (mean & std, perhaps) and allows users to "just" download the mean & std (instead of downloading every one of the 100 ensemble members). So hopefully we'd end up "just" having to handle 600 GB per year (because, for each NWP variable, we'd have a mean and a std).
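The back-of-envelope volumes above, written out as a sketch. The 300 GB per member per year and 100 members are the rough figures from this thread, and the two summary fields (mean and std) are my guess at what ECMWF might provide, not something confirmed.

```python
GB_PER_MEMBER_PER_YEAR = 300  # rough figure from the thread
N_ENSEMBLE_MEMBERS = 100      # rough figure from the thread
N_SUMMARY_FIELDS = 2          # assumed: ensemble mean and std per variable

# Downloading every ensemble member (decimal units: 1 TB = 1000 GB).
full_ensemble_tb = GB_PER_MEMBER_PER_YEAR * N_ENSEMBLE_MEMBERS / 1000

# Downloading only the summary statistics, assuming each is the size of one member.
summary_gb = GB_PER_MEMBER_PER_YEAR * N_SUMMARY_FIELDS

print(f"full ensemble: {full_ensemble_tb:.0f} TB/year")
print(f"mean + std only: {summary_gb} GB/year")
```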

devsjc commented 1 year ago

Gotcha @JackKelly - so which of these options would you pick when trying to access those? This is what I get to choose from.

Parameter options image
JackKelly commented 1 year ago

The short answer is: I don't see the "ensemble mean and spread" in that list. It's possible that ECMWF don't provide summary stats (which would be odd).

If ECMWF do do it, then we'd be looking for the ensemble mean and spread for the atmospheric ENS (i.e. the mean and spread across all 100 ENS members). Here are some detailed docs from ECMWF, which strongly suggest that ECMWF do compute the "ENS mean and spread": https://confluence.ecmwf.int/display/FUG/Section+8.1.2+ENS+Mean+and+Spread

We definitely don't want SEAS.

JackKelly commented 1 year ago

Actually, this ECMWF page is probably the right place to start. It gives an overview of all of the "basic ensemble products". And gives some good, concise advice. For example, on the topic of "ensemble means and spread" it says:

All ensemble members might forecast an intense low-pressure system with gale force winds, but in different positions. But in this case, the ensemble mean will only show a rather shallow spread out depression giving the impression of weak average winds. High-impact events, which in the ensemble mean appear weak or absent, can be easily overlooked, or at best regarded as less predictable.

It is essential to inspect the postage stamps and/or use probabilities in conjunction with ensemble mean
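A toy version of the point in that quote, as a hedged sketch: every member below contains a gale, but at a different position, so the ensemble mean looks calm everywhere. The 1-D grid, wind speeds, and threshold are all invented for illustration and have nothing to do with real ENS output.

```python
N_POINTS = 50      # points along a hypothetical 1-D transect
CALM = 5.0         # background wind speed, m/s (made up)
GALE = 25.0        # wind speed inside the storm, m/s (made up)
THRESHOLD = 20.0   # what we count as "gale force" in this toy example

# Ten members, each with a 3-point-wide gale centred at a different position.
centres = [2 + 5 * i for i in range(10)]
members = []
for centre in centres:
    field = [CALM] * N_POINTS
    for x in (centre - 1, centre, centre + 1):
        field[x] = GALE
    members.append(field)

# Ensemble mean at each point: the displaced gales get averaged away.
ens_mean = [sum(m[x] for m in members) / len(members) for x in range(N_POINTS)]
peak_of_mean = max(ens_mean)

# Probability-style products keep the signal the mean loses.
prob_gale_somewhere = sum(max(m) >= THRESHOLD for m in members) / len(members)
prob_gale_at_point = [
    sum(m[x] >= THRESHOLD for m in members) / len(members) for x in range(N_POINTS)
]

print("peak of ensemble mean:", peak_of_mean)
print("P(gale somewhere in domain):", prob_gale_somewhere)
print("max P(gale at any single point):", max(prob_gale_at_point))
```

The mean never exceeds 7 m/s even though all ten members contain a 25 m/s gale, which is why the mean and spread alone may not be enough for high-impact events.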

@dantravers @jacobbieker & @dfulu it might be worth quickly skim-reading ECMWF's overview of Basic ensemble products so we can make the most informed decision about which ensemble products we want to consume (if any!)