Open mathause opened 4 days ago
Shouldn't `n_scen` be `n_ens` or `n_runs`? Or do you actually mean `n_scen`, because if you were to weigh each sample by `1/n_scen`, scenarios with more members would be overrepresented.
Yes you are right - I mean `n_ens`. I'll correct it above.
I always thought that the scenario weights applied to the linear regression are given by `1 / (n_ens * n_ts)`. However, it's `1 / n_ens`. I probably misinterpreted this. The original code (v0.8.0) is here: https://github.com/MESMER-group/mesmer/blob/13f048b1106faf302755a6181358243b43fffb5b/mesmer/calibrate_mesmer/train_utils.py#L50-L58
I refactored this in #143 and adapted the comment to
https://github.com/MESMER-group/mesmer/blob/456776d4a318e50bc7f642f097354c76c24a21fc/mesmer/calibrate_mesmer/train_utils.py#L14-L15
but importantly the code stayed the same:
https://github.com/MESMER-group/mesmer/blob/456776d4a318e50bc7f642f097354c76c24a21fc/mesmer/calibrate_mesmer/train_utils.py#L39
From Beusch et al. (2022):
I think it's not 100% clear - you could argue that the historical scenario does get a bit more weight as it has more time steps. But saying the weight is `1 / n_ens` is a just-as-valid interpretation of "equal weight for each scenario". So in conclusion there is nothing to do here (except maybe to adapt my comment).

Originally commented in https://github.com/MESMER-group/mesmer/pull/567#pullrequestreview-2464567678
edit: corrected `n_scen` -> `n_ens`
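
To make the two interpretations of "equal weight for each scenario" concrete, here is a minimal sketch that sums the per-sample weights within each scenario. It is not the mesmer implementation: the scenario names, member counts, and time-step counts are made up for illustration, and `sample_weight` / `total_scenario_weight` are hypothetical helpers.

```python
from fractions import Fraction

# Hypothetical setup (not taken from mesmer): scenario -> (n_ens, n_ts)
scenarios = {
    "historical": (3, 165),
    "ssp126": (3, 86),
    "ssp585": (5, 86),
}


def sample_weight(n_ens, n_ts, per_timestep=False):
    """Weight of one sample (one ensemble member at one time step)."""
    if per_timestep:
        # interpretation 1 / (n_ens * n_ts)
        return Fraction(1, n_ens * n_ts)
    # interpretation 1 / n_ens
    return Fraction(1, n_ens)


def total_scenario_weight(n_ens, n_ts, per_timestep=False):
    """Sum of the weights over all n_ens * n_ts samples of a scenario."""
    return n_ens * n_ts * sample_weight(n_ens, n_ts, per_timestep)


totals_per_ens = {
    name: total_scenario_weight(n_ens, n_ts)
    for name, (n_ens, n_ts) in scenarios.items()
}
totals_per_ens_ts = {
    name: total_scenario_weight(n_ens, n_ts, per_timestep=True)
    for name, (n_ens, n_ts) in scenarios.items()
}

for name in scenarios:
    print(name, totals_per_ens[name], totals_per_ens_ts[name])
```

Under these assumptions, weighting by `1 / n_ens` makes the total weight of a scenario independent of its number of members but proportional to its number of time steps, so the longer historical period carries more weight. Weighting by `1 / (n_ens * n_ts)` gives every scenario the same total weight.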