r-spatial / spdep

Spatial Dependence: Weighting Schemes and Statistics
https://r-spatial.github.io/spdep/
116 stars 26 forks source link

implement bivariate local join count #97

Closed JosiahParry closed 1 year ago

JosiahParry commented 1 year ago

This pull request is a follow up to #94 to provide an interface to the bivariate local join count. The bivariate local join count covers cases of in-situ (CLC) and no in-situ colocation (BJC).

This PR implements the function local_joincount_bv(). It checks inputs x and z to determine whether the CLC or BJC case is being assessed. Based on the identified case, calculates the appropriate bivariate join count measure.

The example uses the columbus dataset for simplicity sake. If the detroit housing or chicago block classification data for the cited paper is readily available, I am happy to modify.

Please see the below comparison to rgeoda implementations.

devtools::load_all()
#> ℹ Loading spdep
#> Loading required package: sp
#> Loading required package: spData
#> To access larger datasets in this package, install the spDataLarge
#> package with: `install.packages('spDataLarge',
#> repos='https://nowosad.github.io/drat/', type='source')`
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.2.3, PROJ 7.2.1; sf_use_s2() is TRUE
library(sf)
library(rgeoda)
#> Loading required package: digest
#> 
#> Attaching package: 'rgeoda'
#> The following object is masked from 'package:spdep':
#> 
#>     skater

guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
g <- st_read(guerry_path)
#> Reading layer `Guerry' from data source 
#>   `/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/rgeoda/extdata/Guerry.shp' 
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 85 features and 29 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 47680 ymin: 1703258 xmax: 1031401 ymax: 2677441
#> Projected CRS: NTF (Paris) / Lambert zone II
queen_w <- queen_weights(g)
g["InvCrm"] <-  1 - g[["TopCrm"]]

listw <- nb2listw(poly2nb(g), style = "B")

# BJC case ----------------------------------------------------------------
res <- local_joincount_bv(g$TopCrm, g$InvCrm, listw, nsim = 9999)
lisa <- local_bijoincount(queen_w, g[c("TopCrm", "InvCrm")], 
                          permutations = 9999)

# check that the JC values are the same
all.equal(res$join_count, lisa$lisa_vals)
#> [1] TRUE

ps <- data.frame(na.omit(cbind(res$p_sim, lisa$p_vals))) |> 
  setNames(c("impl", "rgeoda"))

ps
#>      impl rgeoda
#> 1  0.0918 0.1061
#> 2  0.0106 0.0106
#> 3  0.0127 0.0094
#> 4  0.0313 0.0251
#> 5  0.1405 0.1352
#> 6  0.0352 0.0354
#> 7  0.0314 0.0286
#> 8  0.2894 0.2881
#> 9  0.2122 0.2159
#> 10 0.3696 0.3692
#> 11 0.0390 0.0318
#> 12 0.3971 0.4010
#> 13 0.4934 0.4987
#> 14 0.0947 0.0935
#> 15 0.2938 0.2899
#> 16 0.0021 0.0020
#> 17 0.3848 0.3879
#> 18 0.0326 0.0298
#> 19 0.0969 0.0905
#> 20 0.0782 0.0771
#> 21 0.2938 0.2977
#> 22 0.0819 0.0808
#> 23 0.0004 0.0005
#> 24 0.0787 0.0789
#> 25 0.0017 0.0040

# mean abs difference
mean(abs(ps[,1] - ps[,2]))
#> [1] 0.003204

# CLC case ----------------------------------------------------------------
set.seed(0)
x <- rbinom(85, 1, 0.4)
set.seed(1)
z <- rbinom(85, 1, 0.7)

res <- local_joincount_bv(x, z, listw, nsim = 9999)
lisa <- local_multijoincount(queen_w, data.frame(x, z), permutations = 9999)

# check that join count stats are the same
all.equal(res$join_count, lisa$lisa_vals)
#> [1] TRUE

# compare p values 
ps <- data.frame(na.omit(cbind(res$p_sim, lisa$p_vals))) |> 
  setNames(c("impl", "rgeoda"))

ps
#>      impl rgeoda
#> 1  0.4067 0.4183
#> 2  0.2539 0.2479
#> 3  0.2938 0.2945
#> 4  0.1965 0.1994
#> 5  0.1250 0.1288
#> 6  0.0778 0.0778
#> 7  0.4636 0.4553
#> 8  0.4197 0.4174
#> 9  0.4250 0.4204
#> 10 0.4607 0.4582
#> 11 0.3163 0.3188
#> 12 0.0903 0.0868
#> 13 0.4026 0.4002
#> 14 0.2513 0.2579
#> 15 0.4069 0.4074
#> 16 0.3186 0.3123
#> 17 0.4540 0.4777
#> 18 0.3177 0.3106
#> 19 0.1220 0.1272
#> 20 0.1244 0.1223
#> 21 0.2628 0.2592
#> 22 0.4309 0.4245
#> 23 0.4355 0.4465
#> 24 0.1848 0.1869
#> 25 0.3464 0.3560
#> 26 0.1066 0.1031
#> 27 0.0796 0.0791

# mean abs difference
mean(abs(ps[,1] - ps[,2]))
#> [1] 0.005159259
JosiahParry commented 1 year ago

@rsbivand, this is now ready for review.

JosiahParry commented 1 year ago

Note that the build error is for mac OS relating to the CO69.Rmd vignette file.

Error:   Error(s) in re-building vignettes:
  --- re-building ‘CO69.Rmd’ using rmarkdown