Closed fbled closed 1 year ago
I am proposing the following correction to sf.kde which implements the initial proposition above, and also corrects the computation of the unweighted automatic bandwidth matrix when no weights nor bandwidth arguments are present. The current version of the code uses incorrectly ks::hpi to do so. This produces an error, that is ignored because hpi does not handle dimensions >1 (leading to a missing bw, but it does not matter because as mentioned in the previous post, ks::kde (silently!) does not recognize the argument h = bw for dimensions>1, and (silently!) applies Hpi to the dataset.
Proposed solution is then:
sf.kde <- function (x, y = NULL, bw = NULL, ref = NULL, res = NULL, standardize = FALSE,
scale.factor = NULL, mask = FALSE)
{
if (missing(x))
stop("x argument must be provided")
ref.flag = inherits(ref, "SpatRaster")
if (!inherits(x, c("sf", "sfc")))
stop(deparse(substitute(x)), " must be a sf, or sfc object")
if (unique(as.character(sf::st_geometry_type(x))) != "POINT")
stop(deparse(substitute(x)), " must be single-part POINT geometry")
if (is.null(ref)) {
if (!is.null(res)) {
ref <- terra::rast(terra::ext(x), resolution = res)
}
else {
ref <- terra::rast(terra::ext(x))
message("defaulting to ", terra::res(ref)[1], "x",
terra::res(ref)[2], " cell resolution")
}
}
if (inherits(ref, "numeric")) {
if (length(ref) != 4)
stop("Need xmin, xmax, ymin, ymax bounding coordinates")
if (!is.null(res)) {
ref <- terra::rast(terra::ext(ref), resolution = res)
}
else {
ref <- terra::rast(terra::ext(ref))
message("defaulting to ", terra::res(ref)[1], "x",
terra::res(ref)[2], " cell resolution")
}
}
else {
if (!inherits(ref, "SpatRaster"))
stop(deparse(substitute(ref)), " must be a terra SpatRast object")
}
n <- c(terra::nrow(ref), terra::ncol(ref))
if (is.null(bw)) {
if (is.null(y)) {
bw = suppressWarnings(ks::Hpi(sf::st_coordinates(x)[,
1:2]))
message("Unweighted automatic bandwidth: ")
print(bw)
}
else {
bw = suppressWarnings(ks::Hscv.diag(cbind(sf::st_coordinates(x)[,
1:2], y)))
message("Weighted automatic CV bandwidth: ")
print(bw)
}
}
else {
message("Using specified bandwidth: ")
print(bw)
if(is.atomic(bw) && length(bw) == 1L) { bw=diag(bw,2) }
}
if (!is.null(y)) {
message("\n", "calculating weighted kde", "\n")
kde.est <- suppressWarnings(terra::rast(matrix(ks::kde(sf::st_coordinates(x)[,
1:2], H = bw, eval.points = terra::xyFromCell(ref,
1:terra::ncell(ref)), gridsize = n, w = y, density = TRUE)$estimate,
nrow = n[1], ncol = n[2], byrow = TRUE), extent = terra::ext(ref)))
}
else {
message("\n", "calculating unweighted kde", "\n")
kde.est <- suppressWarnings(terra::rast(matrix(ks::kde(sf::st_coordinates(x)[,
1:2], H = bw, eval.points = terra::xyFromCell(ref,
1:terra::ncell(ref)), gridsize = n, density = TRUE)$estimate,
nrow = n[1], ncol = n[2], byrow = TRUE), extent = terra::ext(ref)))
}
if (!is.null(scale.factor))
kde.est <- kde.est * scale.factor
if (standardize == TRUE) {
kde.est <- kde.est/terra::global(kde.est, "max", na.rm = TRUE)[,
1]
}
if (mask) {
if (!ref.flag) {
message("Since a raster was not used as ref, there is nothing to mask")
}
else {
kde.est <- terra::mask(kde.est, ref)
}
}
terra::crs(kde.est) <- terra::crs(x)
return(kde.est)
}
I reverted the function to a Gaussian estimate. The multivariate estimate implemented in ks is questionable with spatial data. The bandwidth bug is also fixed.
Hello Jeffrey!
First of all, thanks a lot for the amazing work you've put in Spatial Eco, this is an awesome package, and I love using it! I have recently had to use it to compute some Gaussian Kernel Density estimate for point spatial data with
sf.kde
, and I have noticed an issue in the way it callsks::kde
for the actual computation. As of now, because of the waysf.kde
callsks::kde
,ks::kde
completely ignores the bandwidth argument. Currently,sf.kde
callsks::kde
this way (formated and commented for clarity):It doesn't really matter whether or not we're using weighted or unweighted Gaussian Kernel Density estimates, the issue is the same. I believe it comes from the fact that the call specifies the bandwidth using the
h = bw
argument. However,ks::kde
only usesh
for one-dimensional data. When dealing with 2+ dimensional data, it relies on the argumentH
(a bandwidth matrix), and ignores anyh
argument. IfH
is not specified,Hpi
is called by default.This issue could be resolved in a quick and dirty way by changing the
sf.kde
code with something along the lines of:or using any more appropriate way to compute (or specify) the 2-dimensional bandwidth matrix based on the method of interest.
It would be interesting to either correct the code to force the user to input a 2-dimensional bandwidth matrix in
sf.kde
that is then directly fed inks::kde
using the appropriate argument call (H
), or havesf.kde
create said 2-dimensional bandwidth matrix from the user-specifiedbw
argument.Again, thanks for looking into it, and for the great work you put in this package!