r-spatial / gstat

Spatial and spatio-temporal geostatistical modelling, prediction and simulation
http://r-spatial.github.io/gstat/
GNU General Public License v2.0

An error report that doesn't specify which columns are not specified in gstat #111

Open FanLukeLi opened 2 years ago

FanLukeLi commented 2 years ago

Hello Dr. Pebesma, the reproducible example below ends in an error that does not say which columns are undefined. Could you tell me which data frame is missing a definition, and what is missing from it? Thank you for your time.

library(reprex)
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
library(tidyverse)
library(gstat)
#> Warning: package 'gstat' was built under R version 4.1.3
library(sp)
library(spacetime)
#> Warning: package 'spacetime' was built under R version 4.1.3
library(raster)
#> 
#> Attaching package: 'raster'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> The following object is masked from 'package:tidyr':
#> 
#>     extract
library(rgdal)
#> Please note that rgdal will be retired by the end of 2023,
#> plan transition to sf/stars/terra functions using GDAL and PROJ
#> at your earliest convenience.
#> 
#> rgdal: version: 1.5-27, (SVN revision 1148)
#> Geospatial Data Abstraction Library extensions to R successfully loaded
#> Loaded GDAL runtime: GDAL 3.2.1, released 2020/12/29
#> Path to GDAL shared files: C:/Users/Fan Li/Documents/R/win-library/4.1/rgdal/gdal
#> GDAL binary built with GEOS: TRUE 
#> Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
#> Path to PROJ shared files: C:/Users/Fan Li/Documents/R/win-library/4.1/rgdal/proj
#> PROJ CDN enabled: FALSE
#> Linking to sp version:1.4-5
#> To mute warnings of possible GDAL/OSR exportToProj4() degradation,
#> use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
#> Overwritten PROJ_LIB was C:/Users/Fan Li/Documents/R/win-library/4.1/rgdal/proj
library(rgeos)
#> rgeos version: 0.5-8, (SVN revision 679)
#>  GEOS runtime version: 3.9.1-CAPI-1.14.2 
#>  Please note that rgeos will be retired by the end of 2023,
#> plan transition to sf functions using GEOS at your earliest convenience.
#>  GEOS using OverlayNG
#>  Linking to sp version: 1.4-5 
#>  Polygon checking: TRUE

data <- read.csv("data/Synthesis_of_Environmental_Mercury_Loads_in_New_York_State__1969-2017___Chemical_Data.csv")
#> Warning in file(file, "rt"): cannot open file 'data/
#> Synthesis_of_Environmental_Mercury_Loads_in_New_York_State__1969-2017___Chemical_Data.csv':
#> No such file or directory
#> Error in file(file, "rt"): cannot open the connection

dataProcess <- function(){
  names(data)[1] <- "SAMPLE_ID"
  data$Chem_Units[data$Chem_Units == "?g/l"] = "mcg/l"
  data_df <- data %>% 
    dplyr::select("SAMPLE_ID", "Latitude", "Longitude", "BDate", "Year", "TissueCollected", "Chem_Value", "Chem_Units", "Chemical_Type", "Final_Chem_Standardized") %>% 
    filter(Chemical_Type == "THg") %>% 
    filter(Chem_Units == "mcg/l" | Chem_Units == "ng/l") %>% 
    na.omit()
  return(data_df)
}

data_df <- dataProcess()
#> Error in names(data) <- `*vtmp*`: names() applied to a non-vector

data_water <- subset(data_df, TissueCollected == 'Water' | TissueCollected == 'Surface Water' | TissueCollected == 'Groundwater' | TissueCollected == "Treated water supply" | TissueCollected == "Untreated water supply")
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'subset': object 'data_df' not found
sptTransform <- function(data.df) {
  data.df$date <- as.POSIXlt.Date(as.Date(data.df$BDate, format = "%m/%d/%Y"), origin = "1970-01-01 EDT")
  data.df <- subset(data.df, select = -c(BDate, Year))
  data.df$Final_Chem_Standardized <- data.df$Final_Chem_Standardized * 1000000

  coordinates(data.df) = ~Longitude + Latitude
  projection(data.df) = CRS("+init=epsg:4326")

  chemData <- spTransform(data.df, CRS("+init=epsg:32617"))
  return(chemData)
}

spt_water <- sptTransform(data_water)
#> Error in as.Date(data.df$BDate, format = "%m/%d/%Y"): object 'data_water' not found

numVec <- spt_water@data$Final_Chem_Standardized
#> Error in eval(expr, envir, enclos): object 'spt_water' not found
df <- data.frame(numVec)
#> Error in data.frame(numVec): object 'numVec' not found
names(df) <- "FCS6"
#> Error in names(df) <- "FCS6": names() applied to a non-vector

stidf_water <- STIDF(sp = spt_water, 
                     time = spt_water@data$date, 
                     data = df)
#> Error in STIDF(sp = spt_water, time = spt_water@data$date, data = df): object 'spt_water' not found

stplot(stidf_water)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'obj' in selecting a method for function 'stplot': object 'stidf_water' not found

var <- variogramST(FCS6 ~ 1, data = stidf_water, tunit = "days", assumeRegular = F, na.omit = T)
#> Error in variogramST(FCS6 ~ 1, data = stidf_water, tunit = "days", assumeRegular = F, : object 'stidf_water' not found

Created on 2022-07-19 by the reprex package (v2.0.1)

Synthesis_of_Environmental_Mercury_Loads_in_New_York_State__1969-2017___Chemical_Data.zip

edzer commented 2 years ago

I've looked at it and managed to get the commands to run without error up to the last command, but then I get:

> var <- variogramST(FCS6 ~ 1, data = stidf_water, tunit = "days", assumeRegular = F, na.omit = T)
  |                                                                      |   0%Error in `[.data.frame`(x@data, i, j, ..., drop = FALSE) : 
  undefined columns selected

I don't know why. It may have to do with your data being irregularly distributed over time; selecting a time period with more complete data may work.
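A minimal sketch of the "select a time period with more complete data" idea, using base R only. The toy data frame and the 2005-2006 window are hypothetical stand-ins, not values from the mercury dataset; only the column names and the date format are taken from the reprex. Whether this removes the `variogramST` error is untested here.

```r
# Hypothetical toy records; BDate uses the month/day/year format from the reprex.
toy <- data.frame(
  BDate = c("01/15/1975", "06/01/2005", "06/15/2005", "07/04/2005",
            "03/10/2006", "04/22/2006", "09/30/2016"),
  Final_Chem_Standardized = c(1.2, 0.8, 0.9, 1.1, 0.7, 1.0, 1.3)
)
toy$date <- as.Date(toy$BDate, format = "%m/%d/%Y")

# Count observations per calendar year to see where the data are dense.
obs_per_year <- table(format(toy$date, "%Y"))
print(obs_per_year)

# Keep only a window with comparatively many observations.
# The window bounds are an arbitrary choice for this toy data.
in_window <- toy$date >= as.Date("2005-01-01") &
  toy$date <= as.Date("2006-12-31")
toy_dense <- toy[in_window, ]
print(nrow(toy_dense))
```

Applied to the real data, the same filter would go before the `coordinates()` and `STIDF()` steps, so that the space-time object is built from the denser period only.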

FanLukeLi commented 2 years ago

Hello Professor Pebesma. This error report is exactly why I'm contacting you for help. Could you please give an example of the kind of data that would work? Or could you please find out what is causing this error?
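For context, the message `undefined columns selected` is raised by base R's data frame indexing when a requested column does not exist. The sketch below reproduces it on a hypothetical one-column data frame and shows one way to list the missing names. It does not show what `variogramST` requests internally; the `wanted` vector is an assumption made purely for illustration.

```r
# Hypothetical data frame with a single attribute column, as in the reprex.
df <- data.frame(FCS6 = c(1.5, 2.0, 3.2))

# Assumed set of requested columns; "date" is deliberately absent from df.
wanted <- c("FCS6", "date")

# Indexing with a missing column name triggers the same base R error message.
msg <- tryCatch(
  {
    df[, wanted, drop = FALSE]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Comparing requested names against the available ones names the culprit.
missing_cols <- setdiff(wanted, names(df))
print(missing_cols)
```

On the real objects, running `traceback()` right after the failing call and inspecting `names(stidf_water@data)` are the analogous checks.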