r-spatial / gstat

Spatial and spatio-temporal geostatistical modelling, prediction and simulation
http://r-spatial.github.io/gstat/
GNU General Public License v2.0

abort session fatal error when using variogram() #91

Closed KatLeigh11 closed 10 months ago

KatLeigh11 commented 3 years ago

I've NO CLUE what's going on. If I run gstat::variogram() on a single, isolated dataframe in my data, it works fine. But as soon as I try to use it on multiple dataframes within lapply (see code #1), it causes R to crash (i.e. the "abort session" fatal error pop-up). This also happens when I write a for-loop version of it, and when I try to run it on a dataframe selected from within the larger list (see code #2). Please help!

My data: (see nested list file here: https://drive.google.com/file/d/19J2uLMg0EJI8ewFe1ISSdKX-9xSrWOhJ/view?usp=sharing)

My code #1:

## Make a variogram:

apply_var <- lapply(X = split_df_mat,
                    FUN = function(x) {
                      v <- variogram(
                        object = abundance_tot ~ neg_dist_sum + mon,
                        data = x)
                    })

My code #2:

var_1 <- variogram(object = abundance_tot ~ neg_dist_sum + mon, data= split_df_mat[[1]])

Note: you can open the linked file by running this code in R: split_df_mat <- readRDS('split_df_mat.Rds')

edzer commented 3 years ago

The error message you get clearly points to a bug in gstat, but what you're doing is essentially computing residuals from a regression model with two predictors on datasets with too few observations. If you restrict this to datasets with sufficient observations, things seem to work, e.g.

## Make a variogram:

library(gstat)
split_df_mat <- readRDS('split_df_mat.Rds')

# keep only the datasets with more than 10 observations
sel = sapply(split_df_mat, nrow) > 10

# compute the residual variogram for each remaining dataset
apply_var <- lapply(X = split_df_mat[sel],
                    FUN = function(x) {
                      variogram(
                        object = abundance_tot ~ neg_dist_sum + mon,
                        data = x)
                    })

(although 10 is still a very small number)
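
A variation on the filter above, as a rough sketch only: instead of dropping the small datasets, return NULL for them so the result keeps the names and length of the input list. The helper apply_if_enough() and the toy list are hypothetical and not part of gstat; the threshold of 10 is taken from the example above, and base R is all that is needed here.

```r
# Hypothetical helper: apply f() only to data frames with more than
# min_n rows; smaller ones yield NULL instead of being passed to f().
apply_if_enough <- function(lst, f, min_n = 10) {
  lapply(lst, function(x) {
    if (nrow(x) > min_n) f(x) else NULL
  })
}

# Toy stand-in for split_df_mat (made-up data, for illustration only)
toy_list <- list(
  small = data.frame(y = 1:3),
  large = data.frame(y = 1:25)
)

# nrow is used as a placeholder for the real per-dataset computation
res <- apply_if_enough(toy_list, f = nrow)
```

With gstat loaded, f could be something like function(x) variogram(abundance_tot ~ neg_dist_sum + mon, data = x), and the NULL entries then show which datasets were skipped.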

Also, it looks like you have geographical coordinates (in degrees), but you don't tell gstat this; if so you need to set st_crs(x) = 4326 just before the call to variogram().
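
A minimal sketch of the CRS step, assuming each dataset is (or can be converted to) an sf object with point geometries in longitude/latitude degrees. The toy data, the column names lon and lat, and treating mon as numeric are assumptions made for illustration; only st_crs(x) = 4326 and the variogram() call come from the thread.

```r
library(sf)
library(gstat)

# Toy data (made up): 30 points with coordinates in degrees
set.seed(1)
toy <- data.frame(
  lon = runif(30, -80, -79),
  lat = runif(30, 25, 26),
  neg_dist_sum = rnorm(30),
  mon = rnorm(30),
  abundance_tot = rpois(30, 5)
)

# Convert to an sf object; at this point no CRS is attached
x <- st_as_sf(toy, coords = c("lon", "lat"))

# Tell sf (and hence gstat) that the coordinates are WGS84 long/lat
st_crs(x) = 4326

# Residual variogram, now computed with the CRS set
v <- variogram(abundance_tot ~ neg_dist_sum + mon, data = x)
```

Inside the lapply() call, the st_crs(x) = 4326 line would go at the top of the function body, just before variogram().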