Consider adding jitter option to knearneigh

r-spatial / spdep

Spatial Dependence: Weighting Schemes and Statistics

library(sf)
library(spdep)

st_knn <- function(geometry, k = 1, symmetric = FALSE, ...) {
  ks <- spdep::knearneigh(geometry, k = k, ...)
  nb <- spdep::knn2nb(ks, sym = symmetric)
  nb
}

houses <- readr::read_csv("https://raw.githubusercontent.com/xj-liu/spatial_feature_incorporation/main/houses1990.csv") |>
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326)

locs <- st_geometry(houses)

# notice duplicate entries and warning regarding rbind issue
head(st_knn(locs, 10))

# here we apply very small jittering & now
# there are no warnings & we have similar answers
st_jitter(locs, 0.001) |>
  st_knn(10) |>
  head()

Please see https://github.com/r-spatial/spdep/commit/a9e435bc8e24c7f293ae22dbef39832d80fb05df and https://github.com/r-spatial/spdep/commit/22f6f5f97a093525da1cfa1ca7e4e1c93321ba79, and try installing the development version:

> head(st_knn(locs, 10))
Error in spdep::knearneigh(geometry, k = k, ...) : 
  increase k; k must be at least as large as the largest count of identical points

The rbind problem arose because the s2 query returned only k+1 candidates, and with identical points these may not include the i-th ID itself, which is the one that then needs to be deleted; rbind failed as a result. The locations are typically apartments sharing the same front door. It seems generally better to increase k so that all observations at that point are included as neighbours. With a jitter, arbitrary and random points become neighbours even when they are identical in 2D, so set.seed() is needed for reproducible results; further, the jitter value would have to be given in the units of the coordinates (1 foot, 0.3 m, 0.0001 degrees??). Jitter is feasible, but not a good idea. 3D knn is possible, but I think not with s2.
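As a rough illustration of the "increase k" advice, the sketch below counts how many observations share each location and raises k to that count when needed. It uses base R on a small made-up coordinate matrix; `max_identical_count()` and the toy data are hypothetical and not part of spdep.

```r
# Hypothetical helper (not part of spdep): the largest number of
# observations that share exactly the same pair of coordinates.
# Keys are built from the coordinates' character representation, so
# values that print identically are treated as identical.
max_identical_count <- function(coords) {
  key <- paste(coords[, 1], coords[, 2], sep = "_")
  max(table(key))
}

# Made-up data: three observations at (0, 0), e.g. apartments
# behind the same front door, plus two distinct points.
coords <- rbind(c(0, 0), c(0, 0), c(0, 0), c(1, 1), c(2, 2))

k_wanted <- 2
# Follow the error message: k must be at least as large as the
# largest count of identical points.
k_safe <- max(k_wanted, max_identical_count(coords))
print(k_safe)
```

With real data, the coordinate matrix could be taken from `sf::st_coordinates(locs)` and `k_safe` passed as `k` to `spdep::knearneigh()`; unlike jittering, this involves no random numbers and no choice of distance units.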

Consider adding jitter option to knearneigh #152