google-deepmind / graphcast

Apache License 2.0

Issues with performance of GraphCast operational model #85

Closed LeBronQ closed 3 months ago

LeBronQ commented 4 months ago

Hi! We are using GraphCast for forecasting. We downloaded the “Complete ERA5 global atmospheric reanalysis” data at 0.25 x 0.25 degree resolution from CDS, merged it with the “reanalysis-era5-single-levels” data into a one-day file, and then merged multiple one-day files for ten-day forecasting. The model parameters we used are GraphCast_operational - ERA5-HRES 1979-2021 - resolution 0.25 - pressure levels 13 - mesh 2to6 - precipitation output only.npz. We analyzed the forecast data for the United States and Brazil and found relatively large errors in temperature. On some days the errors suddenly become very large (for example, predicted temperatures of more than 60 degrees Celsius). The figure below shows our experimental results for the state of Minnesota. In addition, the mean temperature is much lower overall, in both Brazil and the United States. We also found that the “dataset_source-hres_date-2022-01-01_res-0.25_levels-13_steps-01.nc” file provided on GCS is slightly different from the 2022-01-01 data we downloaded from CDS. Is there a problem with our input data?

[Figure: 20240628-162355]
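A quick sanity check for spikes like the one described above is to convert the temperature field to degrees Celsius and list the time steps that contain physically implausible values. The following is a minimal sketch, not part of GraphCast: the function name `flag_implausible_steps`, the thresholds, and the `(time, lat, lon)` array layout are assumptions made for illustration, and the input is assumed to be in Kelvin, as in ERA5.

```python
import numpy as np

KELVIN_OFFSET = 273.15


def flag_implausible_steps(temp_kelvin, max_celsius=60.0, min_celsius=-90.0):
    """Return indices of time steps with values outside [min_celsius, max_celsius].

    temp_kelvin: array with time as the first axis, e.g. (time, lat, lon),
    assumed to be in Kelvin (hypothetical layout for this sketch).
    """
    temp_celsius = np.asarray(temp_kelvin, dtype=float) - KELVIN_OFFSET
    # Reduce over every axis except time so one flag remains per time step.
    spatial_axes = tuple(range(1, temp_celsius.ndim))
    too_hot = (temp_celsius > max_celsius).any(axis=spatial_axes)
    too_cold = (temp_celsius < min_celsius).any(axis=spatial_axes)
    return np.flatnonzero(too_hot | too_cold)


# Synthetic data: 4 time steps on a 2 x 3 grid at 15 degrees Celsius,
# with one grid point at step 2 set to an implausible 340 K.
forecast = np.full((4, 2, 3), 288.15)
forecast[2, 0, 1] = 340.0
bad_steps = flag_implausible_steps(forecast)
print(bad_steps)
```

Flagged steps only show where the forecast goes wrong; they do not by themselves say whether the cause is the input data or the choice of weights.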

alvarosg commented 4 months ago

Thanks for your message. It is a bit hard to say from your description. Could you confirm whether the data you are downloading is ERA5 data, HRES analysis data, or HRES forecast input data?

You mention you are downloading ERA5 data; however, the HRES operational model does not take ERA5 as input but HRES forecast inputs, which we usually download from the MARS repository.

So if you are comparing ERA5 data to this file: dataset_source-hres_date-2022-01-01_res-0.25_levels-13_steps-01.nc, then I would not expect it to match.

alvarosg commented 4 months ago

Could you check whether you get a good match when comparing your data to the ERA5 example data (the files that start with source-era5), and whether the forecast is good if you use the ERA5 weights?

Otherwise, it seems the main problem is that you may have to download a separate dataset from MARS to initialize the operational model.
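The comparison suggested above can be sketched as a per-variable check between your own data and the example file. This is only an illustration under stated assumptions: `compare_variables` is a hypothetical helper, not a GraphCast function, it works on plain `{name: array}` mappings, and the variable names and values in the example are made up.

```python
import numpy as np


def compare_variables(own, reference, rtol=1e-5, atol=1e-8):
    """Compare arrays that share a name in two {name: array} mappings."""
    report = {}
    for name in sorted(set(own) & set(reference)):
        a = np.asarray(own[name], dtype=float)
        b = np.asarray(reference[name], dtype=float)
        if a.shape != b.shape:
            # Differing grids or level counts cannot be compared element-wise.
            report[name] = {"match": False, "reason": f"shape {a.shape} vs {b.shape}"}
            continue
        report[name] = {
            "match": bool(np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=True)),
            "max_abs_diff": float(np.nanmax(np.abs(a - b))),
        }
    return report


# Made-up values standing in for your data and the example file.
own = {
    "2m_temperature": np.array([[280.0, 281.0]]),
    "mean_sea_level_pressure": np.array([101325.0]),
}
reference = {
    "2m_temperature": np.array([[280.0, 281.5]]),
    "mean_sea_level_pressure": np.array([101325.0]),
}
report = compare_variables(own, reference)
for name, result in report.items():
    print(name, result)
```

With xarray, the two mappings could be built from files opened with `xarray.open_dataset(path)`, for example as `{name: ds[name].values for name in ds.data_vars}`. A shape mismatch usually points to a different grid, number of pressure levels, or number of time steps rather than to differing values.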

LeBronQ commented 4 months ago

Thank you very much for answering my question! We downloaded ERA5 reanalysis data at 0.25 x 0.25 degree resolution with 13 pressure levels from the CDS website. Can we use it with the "GraphCast - ERA5 1979-2017 - resolution 0.25 - pressure levels 37 - mesh 2to6 - precipitation input and output.npz" model? I see that there is a "source-era5_date-2022-01-01_res-0.25_levels-13_steps-01.nc" file in the GCS bucket.

LeBronQ commented 4 months ago

We have another question. This model has obtained good results in precipitation indicators. Have you tried to use ERA5 data as input for operational models before? Can you get better results in some indicators?

[image]

alvarosg commented 3 months ago

> Have you tried to use ERA5 data as input for operational models before?

We don't typically initialize operational models on ERA5 data, because ERA5 data is not available in real time.