r-spatial / spdep

Spatial Dependence: Weighting Schemes and Statistics
https://r-spatial.github.io/spdep/

Time to compute correlogram from dnearneigh neighbour object #122

Closed CRyan1 closed 1 year ago

CRyan1 commented 1 year ago

Hi, thanks for this package. I'm relatively new to R and spatial statistics. I'm wondering how long it should take to create a sp.correlogram from a dnearneigh neighbour object with an upper bound of 1.4m, and the following parameters:

Neighbour list object: Number of regions: 10735 Number of nonzero links: 42372 Percentage nonzero weights: 0.03676841 Average number of links: 3.947089

This represents a raster grid of 540 m2, with a cell size of 1m2. Input is as a data matrix for coordinates and the variable of interest. The coordinates are projected, in metres.

I've tried this several times and stopped it after an hour each time, to check to see if I've done something wrong. I have 130 grids, most 10,000 m2, so if there is a better way to go about creating correlograms for them, please let me know. Thanks

rsbivand commented 1 year ago

Please show your workflow in code, with details of the objects, critically their CRS. Verbal descriptions are insufficient; a reproducible example using built-in data would be best.

rsbivand commented 1 year ago

Also, correlograms are not a very good idea anyway, since the underlying derivation of the measures uses the whole graph.

CRyan1 commented 1 year ago

Thanks for getting back. Here's the code and data. The CRS is NZTM2000

library(spdep)
library(readr)  # provides read_csv()

# data: dat2.1.csv, dat2.csv
dat2 <- read_csv("dat2.csv")
dat2.1 <- read_csv("dat2.1.csv")

my.knn <- knearneigh(dat2.1, k = 3)
my.nb <- knn2nb(my.knn)
plot(my.nb, dat2.1)
summary(my.nb)
dists <- unlist(nbdists(my.nb, dat2.1, longlat = FALSE))
summary(dists)  # this shows the max dist is 2 (for use as the dnearneigh upper threshold, not 1.4 as stated previously)

my.nb1 <- dnearneigh(as.matrix(dat2.1), 0, 2)
summary(my.nb1)

dat2.2 <- as.vector(dat2$Vegetation_cover)
my.correlog <- sp.correlogram(my.nb1, dat2.2, order = 10, method = "I", zero.policy = TRUE)
print(my.correlog, p.adj.method = "holm")
plot(my.correlog)

alt.1: with package pgirmess (much quicker, a few seconds)

library(pgirmess)

my.correlog_pg <- correlog(dat2.1, dat2.2, method = "Moran", nbclass = NULL)
plot(my.correlog_pg)

alt.2: with package ncf. Whilst this computes quickly, the resulting relationship looks implausible (a negative straight line).

library("ncf")

my.correlog_ncf <- correlog(
  dat2.1$X_Coord_pt,
  dat2.1$Y_Coord_pt,
  dat2.2,
  w = NULL,
  increment = 34,
  resamp = 0,
  latlon = FALSE,
  na.rm = FALSE,
  quiet = FALSE
)

plot(my.correlog_ncf)

rsbivand commented 1 year ago

Thanks for the reprex. Please allow me to get back to you in a day's time, as I am travelling. Maybe compare with similar functionality in https://cran.r-project.org/package=ncf and add any code to the reprex; ncf may be better with distance measures.

rsbivand commented 1 year ago

Another alternative might be pgirmess https://cran.r-project.org/package=pgirmess.

CRyan1 commented 1 year ago

Thanks, have updated the code with comments.

rsbivand commented 1 year ago

There are multiple problems in your workflow. For some reason you had very many more observations than 540 in the neighbour objects. So, with:

library(spdep)
dat2 <-read.csv("dat2.csv")
library(sf)
dat.2_sf <- st_as_sf(dat2, coords=c("X_Coord_pt", "Y_Coord_pt"))
dat.2_sf
my.knn <- knearneigh(dat.2_sf, k=1)
my.nb <- knn2nb(my.knn)
plot(my.nb, st_geometry(dat.2_sf))

[image: plot of the k = 1 nearest-neighbour graph my.nb]

dists <- unlist(nbdists(my.nb, dat.2_sf))
summary(dists)

The distances between the oblique grid points are all 1m.

my.nb1 <- dnearneigh(dat.2_sf, 0, 1)
my.nb1

However, before moving to your attempt to create a correlogram, I checked whether the variable you were examining was suitable. It is not: it is a categorical variable. It is possible to use a join count test (not a correlogram) for categorical data, but with the configuration of these data (the whole area split into two blocks), this would be inappropriate.

table(dat.2_sf$Vegetation_cover)
plot(dat.2_sf["Vegetation_cover"])

[image: plot of Vegetation_cover at the grid points]
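For reference, the join count test mentioned above can be sketched on synthetic data. This is only an illustration and is not run on the poster's data: the 10 x 10 rook grid, the factor levels "bare" and "veg", and the random assignment are all assumptions made for the example. It uses `cell2nb()`, `nb2listw()` and `joincount.test()` from spdep.

```r
library(spdep)

# Assumed toy setting: a regular 10 x 10 grid with rook contiguity
nb <- cell2nb(10, 10, type = "rook")

# Hypothetical two-level categorical variable, assigned at random
set.seed(1)
fx <- factor(sample(c("bare", "veg"), length(nb), replace = TRUE))

# Binary weights, so that same-colour joins are simple counts
lw <- nb2listw(nb, style = "B")

# One test per factor level (same-colour joins)
jc <- joincount.test(fx, lw)
print(jc)
```

With randomly assigned levels there should be little evidence of clustering. When the area is split into two solid blocks, as in the plot above, almost every join is same-colour, and the test says nothing beyond what the map already shows.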

So I'm very unsure what you are trying to do, and this case does not raise any problems for sp.correlogram().

set.seed(1)
o <- sp.correlogram(my.nb1, rnorm(nrow(dat.2_sf)), order=10, method="I")
plot(o)

[image: plot of the correlogram o]
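A self-contained variant of the same check, needing no external files, might look like the sketch below. It assumes a 20 x 27 rook grid (540 cells) built with `cell2nb()` as a stand-in for the poster's grid, a random normal variable, and `order = 5`; none of these come from the original data. `system.time()` gives a rough idea of the run time on a grid of this size.

```r
library(spdep)

# Assumed stand-in for the 540-point grid: 20 rows x 27 columns, rook neighbours
nb <- cell2nb(20, 27, type = "rook")

# Random numeric variable (sp.correlogram expects a numeric vector)
set.seed(1)
x <- rnorm(length(nb))

# Time the Moran's I correlogram up to lag order 5
timing <- system.time(
  cg <- sp.correlogram(nb, x, order = 5, method = "I")
)

print(timing)
print(cg)
```

If a call like this finishes quickly but the real workflow does not, the neighbour object is the first thing to inspect, for example by comparing `length(nb)` with the number of observations expected.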

CRyan1 commented 1 year ago

Ok, thank you. Makes sense.