ajdawson / eofs

EOF analysis in Python
http://ajdawson.github.io/eofs/
GNU General Public License v3.0

eof not normalized and not conserving cell measures #138

Open rebeccaherman1 opened 1 year ago

rebeccaherman1 commented 1 year ago

Thank you for making this functionality for iris cubes.

The documentation claims that the EOF analysis preserves metadata, but cell measures, including cell_area, are not preserved in the produced EOF. Is this the intended behaviour? I am able to re-attach these supplementary variables myself, but it seems preferable to retain all metadata automatically.

More importantly, I expect EOF decomposition to yield an EOF matrix which is orthonormal, and it is not clear to me from the documentation which eofscaling option is meant to yield that. I've tried to verify that the first EOF has magnitude 1 using all 3 scalings, and also using weights equal to the cell_area and to the square root of the cell area, and I never obtain a value of 1. Could you explain how to extract the normalized EOF and how to verify its norm? If there is no scaling option that produces a norm of 1, then I believe there must be a bug in the code.

rebeccaherman1 commented 1 year ago

Update: After many hours of experimentation and digging into the source code, I have discovered that:

(1) The returned EOFs are still weighted somehow according to cell area, and this is why they do not appear to be normalized when the cell area is applied to them again.

(2) The weights are not (the square root of cell area) divided by total area, $\sqrt{a_{ij}}/\sum_{ij} a_{ij}$, but rather the square root of (the cell area divided by total area), $\sqrt{a_{ij}/\sum_{ij} a_{ij}}$.

(3) While the EOFs are returned in their orthonormal form when selecting the 'unscaled' version (eofscaling=0), the PCs are actually returned multiplied by the square root of the eigenvalue, or the singular value (self._P = A * Lh).
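The three points above can be illustrated with a minimal NumPy sketch. It is not the eofs code itself: the random field, the cell areas, and every variable name other than `A`, `Lh` and `E` are assumptions made for illustration, and the SVD is applied directly to area-weighted anomalies under the convention described above.

```python
import numpy as np

rng = np.random.default_rng(0)
ntime, nlat, nlon = 50, 4, 6

# Hypothetical cell areas and a random field; stand-ins for a real cube.
area = rng.uniform(2.0, 5.0, size=(nlat, nlon))
field = rng.standard_normal((ntime, nlat, nlon))
anom = field - field.mean(axis=0)  # remove the time mean

# Point (2): the weights are sqrt(a_ij / sum a_ij) ...
w = np.sqrt(area / area.sum())
# ... and not sqrt(a_ij) / sum a_ij.
w_alt = np.sqrt(area) / area.sum()

# SVD of the weighted anomalies, flattened to (time, space).
X = (anom * w).reshape(ntime, -1)
A, Lh, E = np.linalg.svd(X, full_matrices=False)

# Point (1): the rows of E are orthonormal under a plain sum over grid cells,
# so applying the cell area a second time no longer gives a norm of 1.
eof1 = E[0].reshape(nlat, nlon)
plain_norm = np.sqrt(np.sum(eof1 ** 2))
area_norm = np.sqrt(np.sum(area * eof1 ** 2))

# Point (3): PCs formed as A * Lh carry the singular values,
# so each column has norm Lh[k] rather than 1.
pcs = A * Lh
pc_norms = np.linalg.norm(pcs, axis=0)

print("plain norm of EOF 1:", plain_norm)
print("area-weighted norm of EOF 1:", area_norm)
print("PC norms match singular values:", np.allclose(pc_norms, Lh))
```

Under these assumptions, the way to verify the norm of an unscaled EOF is a plain sum of squares over grid cells, with no additional area weighting.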

As I've noted in #139, reconstructedField is not currently working, and it took me a very long time to realize that I should not include the square root of the eigenvalues when using scaling=0 for EOFs and PCs and trying to reconstruct the field.
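For anyone hitting the same problem, here is a rough NumPy sketch of the reconstruction under the convention described in this thread (PCs already multiplied by the singular values, EOFs orthonormal in the weighted space). The data, the cell areas, and the names `recon`, `wrong` and `recon_k` are hypothetical; this is not the library's `reconstructedField` implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
ntime, nspace = 40, 12

# Hypothetical cell areas and anomalies, already flattened to (time, space).
area = rng.uniform(2.0, 5.0, size=nspace)
w = np.sqrt(area / area.sum())
field = rng.standard_normal((ntime, nspace))
anom = field - field.mean(axis=0)

A, Lh, E = np.linalg.svd(anom * w, full_matrices=False)
pcs = A * Lh  # PCs already include the singular values

# Correct: multiply PCs by EOFs as they are, then undo the weighting.
recon = (pcs @ E) / w

# Incorrect: applying the singular values a second time.
wrong = ((pcs * Lh) @ E) / w

# Truncated reconstruction from the leading k modes.
k = 3
recon_k = (pcs[:, :k] @ E[:k]) / w

print("full reconstruction matches:", np.allclose(recon, anom))
print("double-scaled reconstruction matches:", np.allclose(wrong, anom))
```

With all modes retained, `recon` recovers the anomalies exactly, while `wrong` does not; the truncated version is only an approximation whose weighted error is set by the discarded singular values.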

I think the documentation could be made significantly clearer.