GeoStat-Framework / PyKrige

Kriging Toolkit for Python
https://pykrige.readthedocs.io
BSD 3-Clause "New" or "Revised" License
741 stars 186 forks source link

Returning only one value (straight line) #176

Closed FlashThunder closed 3 years ago

FlashThunder commented 3 years ago

Why doesn't it even try to interpolate the given values? just returns single value for every y: 2356.24193548.

from pykrige.ok import OrdinaryKriging
import matplotlib.pyplot as plt
import numpy as np

plt.style.use("ggplot")

xpred = np.linspace(0, 10, 1000)
y = np.array("1489,868,1655,4315,877,3181,5837,1488,872,982,1022,1107,9318,1185,1105,1063,1141,"
             "8370,1198,983,430,8256,3632,1031,5366,830,1700,1125,632,701,9034,625,1010,132,"
             "11476,713,864,9111,943,816,823,771,540,614,1297,847,877,787,2521,1800,6111,3903,"
             "2580,382,662,652,573,629,6378,297,763,5797".split(","), dtype=float)
x = np.linspace(0, 10, y.shape[0])

# compare OrdinaryKriging as an exact and non exact interpolator
uk = OrdinaryKriging(
    x, np.zeros(x.shape), y, variogram_model="linear", exact_values=False
)
uk_exact = OrdinaryKriging(x, np.zeros(x.shape), y, variogram_model="linear")

y_pred, y_std = uk.execute("grid", xpred, np.array([0.0]), backend="loop")
y_pred_exact, y_std_exact = uk_exact.execute(
    "grid", xpred, np.array([0.0]), backend="loop"
)

y_pred = np.squeeze(y_pred)
y_std = np.squeeze(y_std)

y_pred_exact = np.squeeze(y_pred_exact)
y_std_exact = np.squeeze(y_std_exact)

fig, ax = plt.subplots(1, 1, figsize=(100, 40))

ax.scatter(x, y, label="Input Data")
ax.plot(xpred, y_pred_exact, label="Exact Prediction")
ax.plot(xpred, y_pred, label="Non Exact Prediction")

ax.fill_between(
    xpred,
    y_pred - 3 * y_std,
    y_pred + 3 * y_std,
    alpha=0.3,
    label="Confidence interval",
)
ax.legend(loc=9)
ax.set_ylim(-10000, 10000)
ax.legend(loc=9)
plt.xlabel("X")
plt.ylabel("Field")
plt.show()
MuellerSeb commented 3 years ago

Having a look at the estimated variogram parameters:

>>> print(uk.variogram_model_parameters)
[1.34819582e-11 6.83151812e+06]

This reveals that a pure nugget model was estimated (flat slope, high nugget). Therefore the constant output.

Tinkering around with your input data reveals, that estimating and fitting a variogram is rather hard, since spatial correlation is hard to find. so maybe another routine for interpolating your data could be better.

FlashThunder commented 3 years ago

Oh ok, I understand that my data doesn't fit to interpolate that way. But could You give me a hint on how to fix it? And be able to use it to estimate the missing data, as well as "correct" values? The model seems to work great on examples. It is hard to make a custom variogram, as data differs. I would like to use it to fill missing data (clouds etc) on Sentinel 2 images. I mean having values from different days.