ajdawson / eofs

EOF analysis in Python
http://ajdawson.github.io/eofs/
GNU General Public License v3.0
199 stars 60 forks source link

Error in eofsAsCovariance when using weights #119

Open funkpreacher opened 3 years ago

funkpreacher commented 3 years ago

I am running the NAO example with xarray. Extracting the 1st EOF repeatedly gives different results (only when using weights). It looks like the weights are applied repeatedly.

import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

from eofs.xarray import Eof
from eofs.examples import example_data_path

filename = example_data_path('hgt_djf.nc')
z_djf = xr.open_dataset(filename)['z']

z_djf = z_djf - z_djf.mean(dim='time')

coslat = np.cos(np.deg2rad(z_djf.coords['latitude'].values)).clip(0., 1.)
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(z_djf, weights=wgts)

eof1 = solver.eofsAsCovariance(neofs=1)

clevs = np.linspace(-75, 75, 11)
proj = ccrs.Orthographic(central_longitude=-20, central_latitude=60)
ax = plt.axes(projection=proj)
ax.coastlines()
ax.set_global()
eof1[0, 0].plot.contourf(ax=ax, levels=clevs, cmap=plt.cm.RdBu_r,
                         transform=ccrs.PlateCarree(), add_colorbar=False)
ax.set_title('EOF1 expressed as covariance', fontsize=16)
plt.show()

# ========================================================================
# Extract 1st EOF again and redo the same plot
# ========================================================================
eof1 = solver.eofsAsCovariance(neofs=1)

clevs = np.linspace(-75, 75, 11)
proj = ccrs.Orthographic(central_longitude=-20, central_latitude=60)
ax = plt.axes(projection=proj)
ax.coastlines()
ax.set_global()
eof1[0, 0].plot.contourf(ax=ax, levels=clevs, cmap=plt.cm.RdBu_r,
                         transform=ccrs.PlateCarree(), add_colorbar=False)
ax.set_title('EOF1 expressed as covariance', fontsize=16)
plt.show()

print(eof1-eof1b)

Result of difference between eof1 and eof1b:

<xarray.DataArray 'eofs' (mode: 1, pressure: 1, latitude: 29, longitude: 49)>
array([[[[ 1.09211228e-01,  9.13365940e-02,  7.00722281e-02, ...,
           1.05908773e-01,  1.25983739e-01,  1.41891558e-01],
         [ 2.34846658e-01,  2.13107734e-01,  1.85683208e-01, ...,
           5.54756315e-02,  9.04768289e-02,  1.16930300e-01],
         [ 4.52193967e-01,  4.28573805e-01,  3.97803594e-01, ...,
          -9.20446292e-02, -4.41052427e-02, -7.67394121e-03],
         ...,
         [-5.45899433e+01, -5.52156445e+01, -5.58994769e+01, ...,
          -6.28602958e+01, -6.25882936e+01, -6.22965706e+01],
         [-8.68933483e+01, -8.72562317e+01, -8.76033326e+01, ...,
          -9.52351039e+01, -9.51051557e+01, -9.50488148e+01],
         [            nan,             nan,             nan, ...,
                      nan,             nan,             nan]]]])
Coordinates:
  * mode       (mode) int64 0
  * pressure   (pressure) float32 500.0
  * latitude   (latitude) float32 20.0 22.5 25.0 27.5 ... 82.5 85.0 87.5 90.0
  * longitude  (longitude) float32 -80.0 -77.5 -75.0 -72.5 ... 35.0 37.5 40.0
fcodron commented 3 years ago

Same issue (standard interface). Looking at the source code, what seems to happen is every time you call "eofsAsCovariance", the original data is divided by the weights. So you can do it only once, or you must run the solver again... Other functions are not affected as they do not use the original data (only the PCs / EOFs), or weighting doesn't matter in the case of correlation.

Potential solutions would be