google-deepmind / graphcast

Apache License 2.0

Develop an hourly model #37

Open: erinboyle opened this issue 6 months ago

erinboyle commented 6 months ago

Thanks for and congratulations on the great work here!

I'm probably not the only person excited to use GraphCast but limited by the 6-hour resolution. I work in the energy sector. Forecasting the shape of electricity prices across the day matters a lot to battery revenue, and that shape is highly dependent on weather. Accuracy at ~hourly resolution probably determines when generic weather apps can switch to new models, too.

So, the feature request is to develop a model with hourly rather than 6-hourly resolution. I'm sure you're already considering this but thought I'd submit a formal issue so those of us interested can follow along.

abhinavyesss commented 6 months ago

When predicting, we have to enter certain 'forcing' values. If those values were for t, t+1, t+2, ... hours instead of t, t+6, t+12, ..., wouldn't that generate hourly values?

Tlmnk commented 6 months ago

An hourly model would indeed be very interesting for several use cases. Assuming this model would be used for more short-term predictions (0-48 hours ahead), it would also be great to optimize the forecast accuracy for 1-day lead times instead of the 3.5-day lead time it is currently optimized for.

@abhinavyesss: I think that, to produce meaningful hourly forecasts, the model would have to be retrained on hourly ERA5 data; just adjusting the 'forcing' values would not be enough.
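A rough sketch of why the step size matters, assuming a model that always advances the state by one fixed interval and is rolled out autoregressively. The helper `rollout_valid_times` is hypothetical and is not part of the graphcast library; it only counts which valid times a given step size can reach.

```python
from datetime import datetime, timedelta


def rollout_valid_times(init_time, step_hours, horizon_hours):
    """Valid times reached by repeatedly applying a fixed-step model.

    Hypothetical helper for illustration only; not a graphcast API.
    """
    n_steps = horizon_hours // step_hours
    return [init_time + timedelta(hours=step_hours * k) for k in range(1, n_steps + 1)]


init = datetime(2022, 1, 1, 0)

# A model trained to step 6 hours only ever lands on 6-hourly valid times.
six_hourly = rollout_valid_times(init, step_hours=6, horizon_hours=48)

# An hourly model would need 6x as many autoregressive steps for the same horizon.
hourly = rollout_valid_times(init, step_hours=1, horizon_hours=48)

print(len(six_hourly), len(hourly))  # 8 48
```

The interval is learned during training, so under this assumption feeding forcings at different times would not by itself change how far each step advances the state.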

@erinboyle: note that for commercial use cases like the energy market there would still be licensing issues, since the model weights are published under the non-commercial CC BY-NC-SA 4.0 license.

tewalds commented 6 months ago

The challenge with training a 1-hour model is that we don't have good 1-hour data. ERA5 does indeed have 1-hour data, but it only incorporates new observations every 12 hours, so the other 11 hours are just predictions from the 2016 HRES model. This gives a model trained on 6-hour intervals a weird learning problem: half the time it's learning real physics and half the time it's learning NWP model physics. This is somewhat (mainly?) mitigated by fine-tuning on the HRES dataset, which incorporates real observations every 6 hours. Unfortunately, we don't have any dataset that incorporates observations every hour.

It's certainly possible to train a model on the 1-hour ERA5 data, but I wouldn't expect it to outperform HRES for the first 12 hours. It may outperform HRES when taking a better GraphCast prediction as its input, but that input is likely also out of distribution (due to blurring), so even that is not guaranteed. Either way, I agree that this would be a useful thing to have, but it poses a fair number of challenges to do well.
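One way to read the "half the time" remark, sketched as a small counting exercise. It assumes ERA5 assimilation windows of 12 hours that start at 09 and 21 UTC; those start hours are an assumption added here, not something stated in this thread. The functions `window_id` and `count_window_crossings` are hypothetical names. A training transition that stays inside one window follows the NWP model's own trajectory, while one that crosses a window boundary picks up newly assimilated observations.

```python
def window_id(hour, window_start=9, window_len=12):
    """Index of the assimilation window containing a given hour (UTC, counted from day 0).

    Assumes windows start at 09 and 21 UTC; an hour on a boundary belongs to the new window.
    """
    return (hour - window_start) // window_len


def count_window_crossings(step_hours, total_hours=24):
    """Count transitions t -> t + step that cross a window boundary over one day."""
    starts = range(0, total_hours, step_hours)
    crossings = sum(window_id(t) != window_id(t + step_hours) for t in starts)
    return crossings, len(starts)


print(count_window_crossings(6))  # (2, 4): half of the 6-hour transitions cross a boundary
print(count_window_crossings(1))  # (2, 24): almost all hourly transitions stay inside one window
```

Under these assumptions, an hourly training set would be dominated by within-window transitions, which is consistent with the concern that an hourly model would mostly learn the NWP model's behaviour.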

shermansiu commented 6 months ago

My guess is that MetNet-3 is more suitable for higher time resolution. Unfortunately, there are no official plans to open source MetNet-3.