google-deepmind / graphcast

Apache License 2.0

About loss weights #54

Closed ZhuShengchen closed 5 months ago

ZhuShengchen commented 5 months ago

Great job! I have a question: How are the weights of different variables in the loss function determined? In other words, how did you obtain the current loss function weights?

tewalds commented 5 months ago

Some of the weighting is principled, like the area weighting by latitude, which corrects for the gridded inputs/outputs lying on a sphere. Much of the rest is not especially optimized. The pressure level weights were chosen to bias towards ground level and worked well on 13 pressure levels, but weren't revisited when we switched to 37 levels. There are likely better weightings possible, as shown by us not doing especially well in the stratosphere relative to HRES. We started with equal weighting for surface and atmospheric variables, but found that 0.1 for surface variables worked a bit better, except for t2m, which was hurt by such a low weight. We didn't scan over many possible weightings, so there are almost certainly better ones. What exactly is best also depends on your goal.
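For readers who want to see how these three kinds of weights could fit together, here is a minimal sketch in plain Python. It is not the GraphCast implementation. The function names, the `cos(latitude)` approximation to cell area, the choice of pressure-level weights proportional to pressure (one simple way to bias towards ground level), and the t2m weight of 1.0 are all assumptions made for illustration; only the 0.1 surface weight comes from the comment above.

```python
import math


def latitude_weights(latitudes_deg):
    """Area weights proportional to cos(latitude), normalized to mean 1.

    Assumption: cos(lat) is used as a simple stand-in for grid-cell area.
    """
    raw = [math.cos(math.radians(lat)) for lat in latitudes_deg]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def pressure_level_weights(levels_hpa):
    """Weights proportional to pressure, normalized to mean 1.

    Assumption: higher pressure (closer to the ground) gets a larger weight.
    """
    mean = sum(levels_hpa) / len(levels_hpa)
    return [p / mean for p in levels_hpa]


# Hypothetical per-variable weights: 0.1 for surface variables, with t2m
# kept at 1.0 (the exact t2m value is an assumption).
VARIABLE_WEIGHTS = {"t2m": 1.0, "u10": 0.1, "v10": 0.1, "msl": 0.1}


def weighted_mse(pred, target, lat_weights, scale=1.0):
    """Latitude-weighted MSE over a [lat][lon] field.

    `scale` is the per-variable or per-level weight applied to the result.
    """
    total, count = 0.0, 0
    for row_pred, row_target, w in zip(pred, target, lat_weights):
        for p, t in zip(row_pred, row_target):
            total += w * (p - t) ** 2
            count += 1
    return scale * total / count
```

A total loss would then sum `weighted_mse` over variables and levels, passing `VARIABLE_WEIGHTS[name]` or the matching entry of `pressure_level_weights(...)` as `scale`. Because every weight is a plain multiplier, trying an alternative weighting only means changing these tables.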

ZhuShengchen commented 5 months ago

Have you ever tried L1 loss? I have experimented with our own model, using both L1 and MSE losses without weighting for pressure levels and surface variables. The comparison was made with the models GraphCast and Pangu. Interestingly, L1 tends to favor the pressure levels over the surface variables, while MSE shows the opposite trend.

mjwillson commented 5 months ago

I believe we tried L1 loss earlier in this project and didn't see a clear benefit from it. I don't think we did a detailed breakdown by variable and level, though, so this is interesting and may be worth revisiting at some point. Anyway, closing this issue as I think the main question is answered.
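A breakdown of this kind could be sketched as follows. The helper `loss_shares` and the error values are hypothetical and made up for illustration; they are not results from GraphCast, Pangu, or any other model. The sketch only shows that the share of the total loss attributed to each group depends on whether errors are raised to the first power (L1) or squared (MSE).

```python
def loss_shares(errors_by_group, power):
    """Fraction of the total loss contributed by each group.

    power=1 gives mean absolute error (L1); power=2 gives MSE.
    """
    per_group = {
        name: sum(abs(e) ** power for e in errs) / len(errs)
        for name, errs in errors_by_group.items()
    }
    total = sum(per_group.values())
    return {name: value / total for name, value in per_group.items()}


# Made-up normalized errors, for illustration only (not real model output).
errors = {
    "surface": [0.1, -0.2, 1.5],
    "pressure_levels": [0.5, -0.6, 0.4],
}

l1_shares = loss_shares(errors, power=1)
mse_shares = loss_shares(errors, power=2)
```

With these numbers, the group containing one large error takes a bigger share of the total under MSE than under L1, since squaring amplifies large errors. Which group a given loss ends up favoring in a real model depends on the actual error distributions, so the comparison has to be run on real per-variable, per-level errors.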