Hey @roxyboy! By convention, xeofs uses PC scores normalized by the L2 norm. However, for reconstruction purposes, you'll need the "raw" (unnormalized) PC scores. You can easily get these by calling model.scores(normalized=False).
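For instance, the two are related by a simple rescaling. A minimal sketch of that relation (untested here, assuming a fitted model whose sample dimension is "time"):
import numpy as np
import xarray as xr
# Sketch: the normalized scores should equal the raw scores divided by their
# L2 norm over the sample dimension (assumed to be "time" here).
raw = model.scores(normalized=False)
l2_norm = np.sqrt((np.abs(raw) ** 2).sum("time"))
xr.testing.assert_allclose(model.scores(), raw / l2_norm)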
If you're looking for a simpler route, consider using the inverse_transform method. This method lets you reconstruct the original field by selecting just a subset of PC modes:
pcs = model.scores()
pcs = pcs.sel(mode=slice(1, 5))
field_reconstructed = model.inverse_transform(pcs)
This method also accepts the normalized scores.
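For instance, both routes below should give the same reconstruction (a sketch, untested here; the normalized keyword of inverse_transform is used the same way as in the examples further down):
import xarray as xr
# Sketch: pass the normalized scores directly, or the raw scores together
# with normalized=False; both should reconstruct the same field.
rec_a = model.inverse_transform(model.scores())
rec_b = model.inverse_transform(model.scores(normalized=False), normalized=False)
xr.testing.assert_allclose(rec_a, rec_b)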
Thanks so much for your quick reply!
Hello, apologies for coming back to this, but could you explain why the following gives different results for the first-mode EOF reconstruction?
import matplotlib.pyplot as plt

pcsF = model.scores(normalized=False)
reEOF_explicit = pcsF * model.components()
fig, (ax1, ax2) = plt.subplots(figsize=(10, 4), nrows=1, ncols=2)
reEOF_explicit.isel(mode=0).real.plot(ax=ax1)
pcs = model.scores()
model.inverse_transform(pcs.isel(mode=0)).real.plot(ax=ax2)
Somewhat related to my previous post, could you explain why selecting the mode as opposed to slicing the modes produces different results...?
fig, (ax1,ax2) = plt.subplots(figsize=(10,4), nrows=1, ncols=2)
model.inverse_transform(pcs.isel(mode=0)).real.plot(ax=ax1)
model.inverse_transform(pcs.isel(mode=slice(None,1))).real.plot(ax=ax2)
Hey, no worries! So regarding your first question, I think there's a little mistake in the reconstruction: you actually want to use the dot product between the PC scores and the eigenvectors, but you used element-wise multiplication. Also, you have to take care of scaling your reconstructed data (mean, standard deviation, other weights, etc.).
So a minimal example for reconstructing the original data could look like the following:
import xarray as xr
from xeofs.models import EOF
X = xr.tutorial.open_dataset("air_temperature")["air"]
Xmean = X.mean(dim="time") # store the mean for reconstruction
eof = EOF(n_modes=3)
eof.fit(X, dim="time")
comps = eof.components()
scores = eof.scores(normalized=False)
# Reconstruction by xeofs
Xrec1 = eof.inverse_transform(scores, normalized=False)
# Our own reconstruction by hand
Xrec2 = xr.dot(comps, scores, dims="mode")
Xrec2 = Xrec2 + Xmean
Xrec2 = Xrec2.transpose("time", "lat", "lon") # just ensure the same order of dims
xr.testing.assert_allclose(Xrec1, Xrec2)
Do that with your own data and it should work! =)
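As a side note on the "standard deviation, other weights" point above, here is a hypothetical variant (not part of this thread) sketching how the hand-made reconstruction would change if you fit with standardization enabled (assuming the standardize option of EOF). Exact agreement with inverse_transform may depend on details such as the ddof of the standard deviation, so treat it as a starting point only:
# Hypothetical sketch: with standardize=True the model is fitted on
# (X - mean) / std, so a hand-made reconstruction has to undo both steps.
eof_std = EOF(n_modes=3, standardize=True)
eof_std.fit(X, dim="time")
Xstd = X.std(dim="time")  # store the standard deviation alongside the mean
Xrec_std = xr.dot(eof_std.components(), eof_std.scores(normalized=False), dims="mode")
Xrec_std = Xrec_std * Xstd + Xmean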
As to your second question, I cannot reproduce your result. Would you mind sharing a minimal working example that reproduces your observation? Continuing the example from above, the following
pcs1 = scores.isel(mode=[0]) # <-- use [0] instead of 0 to ensure that the pcs are 2D
pcs2 = scores.isel(mode=slice(None, 1))
xr.testing.assert_allclose(pcs1, pcs2)
Xrec1 = eof.inverse_transform(pcs1, normalized=False)
Xrec2 = eof.inverse_transform(pcs2, normalized=False)
xr.testing.assert_allclose(Xrec1, Xrec2)
works for me! Note, however, that I used mode=[0] to ensure that the PCs are 2D (we keep the mode dimension). Although the numerical reconstruction also works with mode=0, you will notice that Xrec1 has an additional mode dimension which is absent in Xrec2. So in that case xr.testing.assert_allclose will complain about the misaligned dimensions.
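To make the dimensionality point concrete, a tiny illustration (continuing the example above; this is just xarray's scalar vs. list-based selection behavior):
# Selecting with an integer drops the mode dimension (it becomes a scalar
# coordinate), while selecting with a list keeps the mode dimension.
print(scores.isel(mode=0).dims)    # mode dimension dropped
print(scores.isel(mode=[0]).dims)  # mode dimension kept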
Thank you again for your quick reply. Below is a minimal example that I've been testing my code out with:
import numpy as np
import xarray as xr

def f1(x, t):
    return xr.DataArray(1. / np.cosh(x[np.newaxis, :] + 3) * np.exp(2.3j * t[:, np.newaxis]),
                        coords=[('time', t), ('x', x)])

def f2(x, t):
    return xr.DataArray(2. / np.cosh(x[np.newaxis, :]) * np.tanh(x) * np.exp(2.8j * t[:, np.newaxis]),
                        coords=[('time', t), ('x', x)])

x = np.linspace(-5, 5, 128)
t = np.linspace(0, 4 * np.pi, 256)
X1 = f1(x, t)
X2 = f2(x, t)
X = X1 + X2
This gives the dataset whose real part is plotted in the attached figure (X.pdf).
I then apply the EOF analysis:
from xeofs.models import EOF
model = EOF(n_modes=4)
Xmean = X.mean('time')
model.fit(X, dim="time")
expvar = model.explained_variance_ratio()
components = model.components()
pcsF = model.scores(normalized=False)
pcs = model.scores()
# Reconstruction by hand
reEOF_explicit = xr.dot(pcsF, components, dims='mode')
reEOF_explicit += Xmean
# xeofs reconstruction
reEOF = model.inverse_transform(pcs)
# Reconstruction of mode 1
reEOF0 = model.inverse_transform(pcs.isel(mode=slice(None,1)))
reEOF0_ex0 = model.inverse_transform(pcs.isel(mode=0))
reEOF0_ex1 = model.inverse_transform(pcs.isel(mode=[0]))
When plotting the following
fig, (ax1, ax2) = plt.subplots(figsize=(10, 4), nrows=1, ncols=2)
reEOF_explicit.real.plot(ax=ax1)  # Xmean was already added to reEOF_explicit above
reEOF.real.plot(ax=ax2)
ax1.set_title("Recon. by hand")
ax2.set_title("xeofs recon.")
they look different: the left panel is essentially missing the contribution from the third EOF mode.
Also, when comparing the mode 1 reconstruction
fig, (ax1,ax2,ax3) = plt.subplots(figsize=(15,4), nrows=1, ncols=3)
(reEOF0).real.plot(ax=ax1)
(reEOF0_ex0).real.plot(ax=ax2)
(reEOF0_ex1).real.plot(ax=ax3)
ax1.set_title("Slicing")
ax2.set_title("Index")
ax3.set_title("List")
I get the following (see the attached figure), where the first and third panels agree with each other but the middle panel doesn't...
Thanks @roxyboy for the example! That makes things a bit clearer :) Specifically, I didn't catch from your first post that you were working with complex input data. In this case, the correct approach to reconstruction involves using the conjugate transpose, so your example becomes:
# Reconstruction by hand
reEOF_explicit = xr.dot(pcsF, components.conj(), dims='mode')
reEOF_explicit += Xmean
This operation is automatically handled by xeofs, addressing the discrepancies you've noticed in the real part.
However, since xeofs was originally not designed to handle complex input data, reconstructed fields are always enforced to be real, i.e. the imaginary parts are always removed. This is why the imaginary parts of the reconstructions still won't match. Luckily, xeofs can already handle complex data internally, so extending support for complex input data seems straightforward. I'll try to work on that in the coming days.
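So, continuing your example, a sketch (untested here) of the comparison that should now agree when restricted to the real parts; the transpose just aligns the dimension order:
# Sketch: compare xeofs' reconstruction with the conjugate-based hand-made
# one; for now, only the real parts are expected to match.
reEOF_hand = xr.dot(pcsF, components.conj(), dims='mode') + Xmean
xr.testing.assert_allclose(
    reEOF.real.transpose('time', 'x'),
    reEOF_hand.real.transpose('time', 'x'),
)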
Regarding your second point: I found a little bug in the code that occurs when you use normalized PCs together with a single mode (e.g. .isel(mode=0)) in order to reconstruct the data. Using either unnormalized PCs or more than one mode, the differences disappear, e.g. below all the results are the same:
reEOF0 = model.inverse_transform(pcsF.isel(mode=slice(None,1)), normalized=False)
reEOF0_ex0 = model.inverse_transform(pcsF.isel(mode=0), normalized=False)
reEOF0_ex1 = model.inverse_transform(pcsF.isel(mode=[0]), normalized=False)
I'll make a patch to fix this - in the meantime just specify mode=[0] and you're on the safe side :)
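In your example, the safe single-mode reconstruction with the normalized scores would then look like this (a sketch using your variable names):
# Workaround sketch until the patch lands: keep the mode dimension when
# selecting a single mode, even with the normalized scores.
reEOF0_safe = model.inverse_transform(pcs.isel(mode=[0]))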
@roxyboy with the new release you should be able to reconstruct the real part even when using isel(mode=0). Note, though, that your example above will only fully work once complex input data is fully supported. Follow #150 to check progress on that.
@nicrie Thank you so much!
Hi, thank you for developing this awesome package! This is probably a stupid question, but I'm having trouble reconstructing the input data from the EOF modes. I would have naively thought that this could be done by
(model.components() * model.scores()).sum('mode')
but the order of magnitude seems to be off...