edzer / spacetime

Classes and methods for spatio-temporal data
73 stars 20 forks source link

Create STIDF directly #30

Open famuvie opened 6 years ago

famuvie commented 6 years ago

I have a "long" table with irregular data (observations for some municipalities at specific months) that I would like to store together with the administrative map of the region (SpatialPolygons). I tried unsuccessfully to directly create the STIDF object using both stConstruct() and STIDF.

In all the related examples in the package you first create a STFDF which is later coerced to STIDF. But my dataset is very sparse.

Here is a reproducible example adapted from the examples in the package. If I use the complete table it leads to a STFDF object as expected and it works fine. However if the table is not complete it should switch to a STIDF representation (as far as I understand from the JSS paper).


library(spacetime)
library(sp)
sp = cbind(x = c(0,0,1), y = c(0,1,1))
row.names(sp) = paste0("point", 1:nrow(sp))
sp = SpatialPoints(sp)
time = as.POSIXct("2010-08-05")+3600*(10:13)
m = c(10,20,30) # means for each of the 3 point locations
mydata = rnorm(length(sp)*length(time),mean=rep(m, 4))

mydata = data.frame(
  space = rep(row.names(sp), 4),
  time  = rep(time, each = 3),
  values = signif(mydata,3)
)

mystfdf <- stConstruct(mydata, "space", "time", sp)  # This works OK

mystidf <- stConstruct(mydata[-1,], "space", "time", sp)
#> Error: nrow(object@time) == length(object@sp) is not TRUE
famuvie commented 6 years ago

Tracing the issue, I saw that it successfully creates a ST object. But when it gets more specific with new("STI", st) it expects to find the same number of time and spatial records.

I tried to replicate the polygons to match the number of rows in the dataset, but the subsetting method for SpatialPolygons complains about replicating polygons.

I also tried to provide a numerical variable in "space" representing polygon indices, but didn't help.

famuvie commented 6 years ago

I found a workaround. Here is a new reproducible example with polygons rather than points to better illustrate the issues. At the end I propose some suggestions.

library(sp)
library(spacetime)
## Define 3 polygons (example from vignette("sp"))
Sr1 = Polygon(cbind(c(2,4,4,1,2),c(2,3,5,4,2)))
Sr2 = Polygon(cbind(c(5,4,2,5),c(2,3,2,2)))
Sr3 = Polygon(cbind(c(4,4,5,10,4),c(5,3,2,5,5)))
Sr4 = Polygon(cbind(c(5,6,6,5,5),c(4,4,3,3,4)), hole = TRUE)

Srs1 = Polygons(list(Sr1), "s1")
Srs2 = Polygons(list(Sr2), "s2")
Srs3 = Polygons(list(Sr3, Sr4), "s3/4")
SpP = SpatialPolygons(list(Srs1,Srs2,Srs3), 1:3)

## Define 4 time points
time = as.POSIXct("2010-08-05")+3600*(10:13)

## Full rectangular grid of observations
m = c(10,20,30) # means for each of the 3 polygons
myvalues = rnorm(length(SpP)*length(time), mean=rep(m, 4))

myfulldata = data.frame(
  space = rep(row.names(SpP), 4),
  time  = rep(time, each = 3),
  values = signif(myvalues,3)
)

## STFDF is Ok
mystfdf <- stConstruct(myfulldata, "space", "time", SpP)

## STIDF not Ok
obs_idx <- c(1:2, 5, 7:9, 11:12)
mysparsedata <- myfulldata[obs_idx, ]
mystidf <- stConstruct(mysparsedata, "space", "time", SpP)
#> Error: nrow(object@time) == length(object@sp) is not TRUE
# Error: nrow(object@time) == length(object@sp) is not TRUE

## stConstruct expects that the number of observations in mysparsedata
## equals the number of spatial objects in SpP
## However, I cannot replicate polygons:
mystidf <- stConstruct(mysparsedata, "space", "time", SpP[mysparsedata$space])
#> Error in SpP[mysparsedata$space]: SpatialPolygons selection: can't find plot order if polygons are replicated
# Error in SpP[mydata$space] : 
#   SpatialPolygons selection: can't find plot order if polygons are replicated

## Nor I can call STIDF directly for the same reason
mystidf <- STIDF(SpP[mysparsedata$space], mysparsedata$time,
                 mysparsedata[, "values", drop = F])
#> Error in SpP[mysparsedata$space]: SpatialPolygons selection: can't find plot order if polygons are replicated

## Nor I can use rbind's makeUniqueIDs, because it messes up the data
mystidf <- STIDF(rbind(SpP,SpP,SpP,SpP, makeUniqueIDs = TRUE),
                 mysparsedata$time, mysparsedata[, "values", drop = F])
cbind(mysparsedata[, 1:3], as.data.frame(mystidf)[, 3])
#>    space                time values as.data.frame(mystidf)[, 3]
#> 1     s1 2010-08-05 10:00:00   10.6                          s1
#> 2     s2 2010-08-05 10:00:00   19.2                          s2
#> 5     s2 2010-08-05 11:00:00   22.5                        s3/4
#> 7     s1 2010-08-05 12:00:00   11.6                         s11
#> 8     s2 2010-08-05 12:00:00   20.0                         s21
#> 9   s3/4 2010-08-05 12:00:00   30.3                       s3/41
#> 11    s2 2010-08-05 13:00:00   22.2                         s12
#> 12  s3/4 2010-08-05 13:00:00   29.0                         s22

## I can't even create the full grid with empty values, and then re-subset
## It turns out it fails again, and for the same reason
obs.table <- with(mysparsedata, table(space, time))
obs.idx <- which(obs.table == 0, arr.ind = TRUE)
filldata <- data.frame(
  space = levels(mysparsedata$space)[obs.idx[, "space"]],
  time = sort(unique(mysparsedata$time))[obs.idx[, "time"]],
  values = NA
)

augmenteddata <- rbind(mysparsedata, filldata)

fakestfdf <- stConstruct(augmenteddata, "space", "time", SpP)
mystidf <- as(fakestfdf, "STIDF")
#> Error in from@sp[from@index[, 1], ]: SpatialPolygons selection: can't find plot order if polygons are replicated

## Workaround: create the replicated spatial object manually
## inspired by rbind.SpatialPolygons
## I think there should be a similar makeUniqueIDs argument for `[`
pls <- SpP@polygons[mysparsedata$space]
ids <- sapply(pls, function(i) slot(i, "ID"))
if (any(duplicated(ids))) {
  ids <- make.unique(as.character(unlist(ids)))
  for (i in seq(along = ids)) pls[[i]]@ID = ids[i]
}
SpP.rep <- 
  SpatialPolygons(pls, proj4string = CRS(proj4string(SpP)))

## Now I can use this object either with stConstruct or STIDF
mystidf <- stConstruct(mysparsedata, "space", "time", SpP.rep)
mystidf2 <- STIDF(SpP.rep, mysparsedata$time, mysparsedata)

all.equal(mystidf, mystidf2, check.attributes = FALSE)
#> [1] TRUE
stplot(mystidf[,, "values"])

Conclusions: