r-spatial / gstat

Spatial and spatio-temporal geostatistical modelling, prediction and simulation
http://r-spatial.github.io/gstat/
GNU General Public License v2.0
194 stars 49 forks source link

Variable transformation give weird variance results when using kriging? #137

Open Cidree opened 6 months ago

Cidree commented 6 months ago

Hello,

I am analysing the interpolation techniques, and I encountered something that appears to be weird for me (I might be wrong).

I am using here 100 random points in Spain where the precipitation has been measured. The precipitation does not follow a normal distribution, so I did a log-transformation and then it follows a normal distribution.

Here, I did two analysis, one with the non-transformed variable and another with the transformed variable. When using ordinary kriging, the predictions of the precipitation are consistent in both cases. However, the variance it's significantly different between both methods.

I tried this with other examples, including meuse data, and the range of the variance are always weird when I use a transformation. Here I paste the code that I am using for transforming the variable, and invert the transformation. If you need the full code for a reproducible example, please let me know.

## Transform precipitation
precipitacion_sf <- precipitacion_sf %>%
  mutate(
    prec_trans = log(prec)
  )

## Fit variogram
prec_var <- variogram(
  object    = prec_trans ~ 1,
  locations = precipitacion_sf
)

prec_opt_vgm <- fit.variogram(
  object = prec_var,
  model  = vgm(c("Sph", "Gau", "Exp"))
)

## Ordinary kriging
prec_ko_sf <- krige(
  formula   = prec_trans ~ 1,
  locations = precipitacion_sf,
  newdata   = grid_sfc,
  model     = prec_opt_vgm
)

## Invert transformation
prec_ko_sf <- prec_ko_sf %>%
  mutate(
    var1.pred = exp(var1.pred),
    var1.var  = exp(var1.var),
  )
edzer commented 6 months ago

The back-transform is not a simple exp() for the log-normal distribution, see e.g. here; you can look into gstat::krigeTg(), with lambda=0 Box-Cox is the log-transform.

Cidree commented 6 months ago

The back-transform is not a simple exp() for the log-normal distribution, see e.g. here; you can look into gstat::krigeTg(), with lambda=0 Box-Cox is the log-transform.

I am not sure if I got it. When using krigeTg() with the argument lambda, we apply the Box-Cox transformation. The function takes it into account internally, so the var1TG.pred and var1TG.var were computed using the Box-Cox transformation, and transformed back to the original scale. Is this right, or am I misinterpreting it?

I am a bit confused from this, because most of the examples I found using the meuse dataset, people use (and I was also taught to do this):

logzinc_krige <- krige(formula = log(zinc) ~ 1,
             meuse,
             meuse.grid,
             model = logzinc_vgm)

And after that, consider the back-transformation as:

logzinc_krige %>% 
  mutate(
    var1.pred = exp(var1.pred),
    var1.var  = exp(var1.var),
  )

Which is the same I did in my example of the precipitation in southern Spain. So the correct, would be to use this and use var1TG.pred and var1TG.var?

logzinc_krige <- krigeTg(formula = log(zinc) ~ 1,
             meuse,
             meuse.grid,
             model = logzinc_vgm),
             lambda = 0

I am sorry for these questions here, but I feel like I have been using this wrong for months.

edzer commented 5 months ago

Have you looked into the documentation and examples of krigeTg?