MESMER-group / mesmer

spatially-resolved ESM-specific multi-scenario initial-condition ensemble emulator
https://mesmer-emulator.readthedocs.io/en/latest/
GNU General Public License v3.0
22 stars 15 forks source link

allow nans in generate harmonic model sum #433

Closed veni-vidi-vici-dormivi closed 1 month ago

veni-vidi-vici-dormivi commented 2 months ago

In the harmonic model we return the fit coefficient array with nans, for coefficients, higher than the order we chose. We need to fill the array up until the max order, so we have coefficient arrays of the same length for all gridcells, so filling the array up with nans for unfit coefficients might have made sense in the beginning. However, imo it makes more sense to fill it up with zeros because in this way one can still use the fit coefficient array to generate_fourier_series and it makes testing easier, see #431.

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 0% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 82.03%. Comparing base (9b0b76b) to head (f848c5e). Report is 29 commits behind head on main.

Files Patch % Lines
mesmer/mesmer_m/harmonic_model.py 0.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #433 +/- ## ========================================== - Coverage 87.90% 82.03% -5.88% ========================================== Files 40 44 +4 Lines 1745 1937 +192 ========================================== + Hits 1534 1589 +55 - Misses 211 348 +137 ``` | [Flag](https://app.codecov.io/gh/MESMER-group/mesmer/pull/433/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=MESMER-group) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/MESMER-group/mesmer/pull/433/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=MESMER-group) | `82.03% <0.00%> (-5.88%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=MESMER-group#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

mathause commented 2 months ago

Unsure about this. Before we merge - would it work with nan if we update

https://github.com/MESMER-group/mesmer/blob/2fdb3d585f9eaf949f15ef3d325347a0ca1d8827/mesmer/mesmer_m/harmonic_model.py#L44

to np.nansum(..., axis=)?

mathause commented 2 months ago

(I like the beauty of 0 but nan is more explicit.)

veni-vidi-vici-dormivi commented 2 months ago

No, because we can't multiply a sequence (yearly_T) with nan.

mathause commented 2 months ago

Doesn't that just result in nan? Where in the code) is that?

(Not trying to be obnoxious, maybe just a little slow. Ah and nothing weird looking at code on Saturday morning at 7 during my holidays 🙃 (I am waiting for my flight))

veni-vidi-vici-dormivi commented 2 months ago

Hm okay not sure what I did last time but you are right, it works. Hope you got home safe! (I for my part am in HO bc I caught Lukas cold last week, but it's not too bad, just don't wanna contaminate anyone).

veni-vidi-vici-dormivi commented 1 month ago

I am quite sure we need np.nansum(..., axis=0)

Why? We only have data in one axis, it does not matter if we sum over axis 0 or all of them. And since it is only used internally this is guaranteed?

veni-vidi-vici-dormivi commented 1 month ago

So you were right, we need axis = 0 because the list actually gets a new dimension for each order. I didn't reload the notebook properly when testing it. Fixing this and finishing work on the tests next 🤦‍♀️

veni-vidi-vici-dormivi commented 1 month ago

Argh no, still wrong. So the reason why I originally had the idea to do it with zeros is because when testing if the harmonic model outputs the "right" order, I realized that sometimes the model will actually output a higher order than the one that an artificial, perfect Fourier series was generated with. However, whenever that happens, the coefficients for the additional order will be very small (aka close to 0). This way, the predictions will still be close to the original series. This means, it does not make sense to test if the order of the fitted model is the original order. What we can test for sure however is if the predictions are close. I implemented that.

I made my peace with the fact that we can't test the order but wanted to at least test if the coefficients were the "right" ones. Turns out that the model is however not able reconstruct the original coefficients because we fit for $a*T_y + b$ so any linear combination that comes out to the same will be a perfect fit. So we can't test for the coefficients either. Thus, I wanted to at least test if the linear combination of the coefficients with the yearly predictor is close to the original linear combination of yearly predictor and coefficients. This however also does not work in the case when the fit outputs an additional order, because that will be compared against nans. Thus, having zeroes for the non-fit coefficients would bypass that problem and we could test this.

I realize now that maybe it is fine to just test if the predictions are close and not test for the order and coefficients in such a conceptual way but test numerical stability of them with fixed data. I kind of got wound up in all the PRs and things I wanted to implement that this now got messed up. Also, I found it really cool to test the model in such a mathematical way but in hindsight I realize that this is maybe not the most useful way here.

So, my final question is @mathause, should we actually make them zeros and test the linear combinations or should we just leave it like this and scrap the linear combination test. A test for numerical stability will have to be added in any case.

mathause commented 1 month ago

Interesting - it might be necessary to discuss that in person. I hope to be "fit" again soon...

Just as a small comment - if we don't fit for $a$ and $b$ (but manually set it to $0$ and $1$) we should subtract $T_y$ and fit on the residuals. Would that help here?

veni-vidi-vici-dormivi commented 1 month ago

Yeah I think that's a good idea. No, I'm sorry $a$ and $b$ are not the ones that control the power of the trend (we did set them to 0 and 1 manually and don't fit them) but they stand for the coefficients in each amplitude of sine and cosine terms in the Fourier series. And I think we do fit on the residuals (see discussion in #443).

$$ T{m,s,y} = \beta{0,s} + \beta{1,s} \cdot T{s,y} + \sum{i = 1}^{n=6} [(a{1,i,s} + b{1,i,s} \cdot T{s,y}) \cdot sin(\frac{i \pi m}{6}) + (a{2,i,s} + b{2,i,s} \cdot T_{s,y}) \cdot cos(\frac{i \pi m}{6})] $$

mathause commented 1 month ago

Oh right I think I understand now. So given $c = (a + b \cdot T)$ you say that there are different combinations of $a$ and $b$ leading to the same $c$?

veni-vidi-vici-dormivi commented 1 month ago

Exactly.