NIEHS / chopin

Scalable GIS methods for environmental and climate data analysis
https://niehs.github.io/chopin/

Self-fix function for distance calculation parallelization #48

Open sigmafelix opened 5 months ago

sigmafelix commented 5 months ago

Parallelizing distance calculations over spatial extents smaller than the full dataset's extent can produce erroneous values when some grids/sub-regions contain no target features, or when edge cases occur near the boundary between adjacent grids/sub-regions. Gradually expanding the grids can fix such edge cases. One challenge is to design a function that determines whether the currently calculated distance is shorter or longer than the actual shortest distance to the nearest feature that would have been found using the full dataset.
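To make the failure mode concrete, here is a minimal base R sketch. The toy coordinates, the split of the unit square at x = 0.5, and the helper `nearest_dist()` are all assumptions for illustration, not part of chopin.

```r
# Hypothetical toy data: one origin close to the shared edge of two grid
# cells (the unit square is split into two cells at x = 0.5).
origins <- matrix(c(0.45, 0.50), ncol = 2, byrow = TRUE)
targets <- matrix(c(0.05, 0.50,   # same cell as the origin, far away
                    0.55, 0.50),  # adjacent cell, close to the origin
                  ncol = 2, byrow = TRUE)

# Euclidean distance from each origin (row of `from`) to its nearest target.
# Returns Inf for every origin when there are no targets at all.
nearest_dist <- function(from, to) {
  if (nrow(to) == 0) return(rep(Inf, nrow(from)))
  apply(from, 1, function(p) {
    min(sqrt((to[, 1] - p[1])^2 + (to[, 2] - p[2])^2))
  })
}

# Targets that fall in the origin's (left) cell
in_left_cell <- targets[, 1] < 0.5

# Nearest distance using the full target set
d_full <- nearest_dist(origins, targets)
# Nearest distance using only the targets in the origin's cell
d_cell <- nearest_dist(origins, targets[in_left_cell, , drop = FALSE])
# Nearest distance when the cell holds no targets
d_empty <- nearest_dist(origins, targets[FALSE, , drop = FALSE])
```

In this toy case the per-cell result (`d_cell`) is longer than the full-dataset result (`d_full`) because the true nearest target sits just across the cell edge, and a cell with no targets yields no finite distance at all.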

Problem statement

Given a grid $G_i$, a point or line target feature set $V$, and a point origin feature set $U$, we want to find whether $\sup d((U_k \cap G_i), (V_l \cap G_i)) < \sup d((U_k \cap G_i), V)$, or whether $\inf d((U_k \cap G_i), (V_l \cap G_i)) > \sup d((U_k \cap G_i), V)$, $\forall k, l$. The $\inf$ problem is the relevant one, as we consider calculating the shortest distance to the target feature set.
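One possible check, sketched here in base R under assumptions, relies on a sufficient condition: if an origin's in-grid nearest distance is no larger than its distance to the grid edge, no target outside the grid can be closer, so the in-grid result already equals the full-dataset result. Origins that fail the check are recomputed on an expanded grid. The function names, the rectangular `bbox` representation, and the fixed `step` are hypothetical and are not the approach confirmed for chopin.

```r
# Euclidean distance from each origin to its nearest target (Inf if none).
nearest_dist <- function(from, to) {
  if (nrow(to) == 0) return(rep(Inf, nrow(from)))
  apply(from, 1, function(p) {
    min(sqrt((to[, 1] - p[1])^2 + (to[, 2] - p[2])^2))
  })
}

# Distance from each origin (assumed to lie inside the cell) to the cell edge.
# bbox is c(xmin, ymin, xmax, ymax).
dist_to_edge <- function(from, bbox) {
  pmin(from[, 1] - bbox[1], bbox[3] - from[, 1],
       from[, 2] - bbox[2], bbox[4] - from[, 2])
}

# Hypothetical self-fix: expand the cell by `step` on every side until each
# in-cell distance is verified, or until the cell contains every target.
nearest_dist_selffix <- function(from, to, bbox, step) {
  stopifnot(step > 0)
  repeat {
    inside <- to[, 1] >= bbox[1] & to[, 1] <= bbox[3] &
      to[, 2] >= bbox[2] & to[, 2] <= bbox[4]
    d <- nearest_dist(from, to[inside, , drop = FALSE])
    # Verified: a disc of radius d around the origin lies within the cell,
    # so targets outside the cell cannot be nearer.
    verified <- d <= dist_to_edge(from, bbox)
    if (all(verified) || all(inside)) return(d)
    bbox <- bbox + c(-step, -step, step, step)
  }
}

# Usage with hypothetical toy data: the nearest target lies across the edge.
origins <- matrix(c(0.45, 0.50), ncol = 2, byrow = TRUE)
targets <- matrix(c(0.05, 0.50, 0.55, 0.50), ncol = 2, byrow = TRUE)
d_fixed <- nearest_dist_selffix(origins, targets,
                                bbox = c(0, 0, 0.5, 1), step = 0.1)
```

The check is conservative: it may trigger an expansion that turns out to be unnecessary, but it never accepts a distance that a target outside the grid could beat. The loop stops at the latest once the expanded grid contains every target.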

Hypothesis

sigmafelix commented 1 month ago

A function in the next version will--

library(terra)
library(sf)
library(alphahull)

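# North Carolina county polygons bundled with sf
nc <- vect(system.file("gpkg/nc.gpkg", package = "sf"))
# draw 3000 random points inside the polygons
ncp <- spatSample(nc, 3000)
# coordinate matrix (x, y) of the sampled points
ncpp <- crds(ncp)
# alpha-convex hull of the points with alpha = 1 (in coordinate units)
ncp_ahull <- ahull(ncpp[, 1], ncpp[, 2], 1)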

# get the outermost point row indices
ncp_ahull$ashape.obj$alpha.extremes