RolnickLab / ClimateSet

A Large-Scale Climate Model Dataset for Machine Learning
GNU General Public License v3.0
21 stars 6 forks

Latitude weighted RMSE not picking up the correct dimension #21

Open hirasaleem0703 opened 2 months ago

hirasaleem0703 commented 2 months ago

The model receives input of shape [batch, seq, vars, lat (96), lon (144)], so the latitude size should be read as lat_size = y.shape[-2]. This is the case if channels_last is set to False.

def LLWeighted_RMSE_WheatherBench(preds: np.ndarray, y: np.ndarray):
    """
    Weigthed RMSE taken from Wheather Bench.
    Weighting to account for decreasing grid sizes towards the pole.

    rmse = mean over forecasts and time of np.sqrt( mean over lon lat L(lat_j)*)MSE(preds, y)
    weights = cos(latitude)/cos(latitude).mean()
    """
    lat_size = y.shape[-1]
    lats = np.linspace(-90, 90, lat_size)

    weights = (np.cos(lats) / np.cos(lats)).mean()

    rmse = np.sqrt(np.mean(weights * ((preds - y) ** 2), axis=(-1, -2))).mean()

    return rmse
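
For comparison, here is a minimal sketch of what a latitude-weighted RMSE could look like when latitude is the second-to-last axis, as in [batch, seq, vars, lat, lon]. The function name lat_weighted_rmse is hypothetical and this is not the repository's fix. It assumes an evenly spaced grid from -90 to 90 degrees, converts degrees to radians because np.cos expects radians, and normalizes the weights as the docstring describes (cos(latitude) / cos(latitude).mean()).

```python
import numpy as np


def lat_weighted_rmse(preds: np.ndarray, y: np.ndarray) -> float:
    """Latitude-weighted RMSE for arrays shaped [..., lat, lon] (illustrative sketch)."""
    # Assumption: latitude is the second-to-last axis, longitude the last.
    lat_size = y.shape[-2]

    # Assumption: evenly spaced latitudes from pole to pole, given in degrees.
    lats = np.deg2rad(np.linspace(-90, 90, lat_size))  # np.cos expects radians

    # Normalize so the weights average to 1, following the docstring formula.
    weights = np.cos(lats) / np.cos(lats).mean()

    # Reshape to (lat, 1) so the weights broadcast along the latitude axis.
    weights = weights[:, np.newaxis]

    # Weighted mean over (lat, lon), square root, then mean over remaining axes.
    per_field = np.sqrt(np.mean(weights * (preds - y) ** 2, axis=(-2, -1)))
    return float(per_field.mean())


# Usage: a constant error of 1 everywhere gives an RMSE of 1,
# because the weights average to 1 over latitude.
preds = np.ones((2, 3, 4, 96, 144))
y = np.zeros((2, 3, 4, 96, 144))
rmse_uniform = lat_weighted_rmse(preds, y)
```

With this layout, an error confined to the row at the pole contributes almost nothing, while the same error at the equator contributes the most.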
liellnima commented 2 months ago

Hi :)

Please refer to issue #12 for this. Setting channels_last to False results in wrong numbers and should not be used to fix this issue.

There are several things involved here as you can see over there - I will let you know here as well when the issue is fixed!

hirasaleem0703 commented 2 months ago

I tried fixing it by changing the callbacks function split_vector_by_variable. I changed this:

splitted_vector[var_name] = vector[..., var_channel_limits["start"] : var_channel_limits["end"]]

to this:

splitted_vector[var_name] = vector[:, :, var_channel_limits["start"] : var_channel_limits["end"], :, :]

Then it results in the correct shape.
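
To illustrate the indexing difference between the two lines above, here is a small sketch on a dummy array. The variable names, channel limits and the helper split_by_variable are hypothetical and are not taken from the repository. It only shows which axis each slice touches. Whether this is the right fix is a separate question, covered in issue #12.

```python
import numpy as np

# Hypothetical channel limits for two variables stacked on the "vars" axis.
var_channel_limits = {
    "tas": {"start": 0, "end": 1},
    "pr": {"start": 1, "end": 2},
}


def split_by_variable(vector: np.ndarray, limits: dict, channel_axis: int = 2) -> dict:
    """Slice `vector` along `channel_axis` for each variable (illustrative sketch)."""
    out = {}
    for name, lim in limits.items():
        # Build an index that selects everything except on the channel axis.
        index = [slice(None)] * vector.ndim
        index[channel_axis] = slice(lim["start"], lim["end"])
        out[name] = vector[tuple(index)]
    return out


# Dummy input shaped [batch, seq, vars, lat, lon].
vector = np.zeros((4, 12, 2, 96, 144))

# `vector[..., a:b]` slices the LAST axis (longitude here), not the variables.
last_axis_slice = vector[..., 0:1]

# Slicing axis 2 is equivalent to `vector[:, :, a:b, :, :]`.
channel_slice = split_by_variable(vector, var_channel_limits)["tas"]

print(last_axis_slice.shape)  # (4, 12, 2, 96, 1)
print(channel_slice.shape)    # (4, 12, 1, 96, 144)
```

Passing the channel axis as a parameter avoids hard-coding the number of leading dimensions, which may matter if the same callback has to handle both channels-first and channels-last layouts.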