tidymodels / spatialsample

Create and summarize spatial resampling objects 🗺
https://spatialsample.tidymodels.org

Add first draft of NNDM function #141

Closed mikemahoney218 closed 1 year ago

mikemahoney218 commented 1 year ago

This PR implements a first draft of a function for nearest-neighbor distance matching LOO-CV, as described in Milà et al. 2022 (https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13851) and implemented in CAST (https://hannameyer.github.io/CAST/reference/nndm.html).

This function produces identical assessment sets to CAST:

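# Convert the Ames housing data to an sf object of points (WGS84)
data(ames, package = "modeldata")
ames_sf <- sf::st_as_sf(ames, coords = c("Longitude", "Latitude"), crs = 4326)

compare_methods <- function(data, prop) {
  # Randomly pick a proportion of rows as training points;
  # the remaining rows act as prediction points
  n <- nrow(data)
  train <- sample.int(n, size = floor(n * prop))

  # For each split, check that every index CAST keeps for training
  # also appears in the corresponding spatialsample split
  purrr::map2_lgl(
    CAST::nndm(data[train, ], ppoints = data[-train, ])$indx_train,
    spatialsample::spatial_nndm_cv(data[train, ], data[-train, ])$splits,
    function(x, y) {
      length(setdiff(x, as.integer(y))) == 0
    }
  ) |>
    all()
}

# Repeat the comparison over 25 random train/prediction partitions
all(replicate(25, compare_methods(ames_sf, 0.75)))
#> [1] TRUE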

Long-term, my intention is to move the loop into C++ to speed up execution. That said, unless your model is extremely fast, the time spent building resamples is probably small compared to the time spent fitting models.

mikemahoney218 commented 1 year ago

@hfrick, any chance you'd be willing to take a look at this one? I always appreciate another set of eyes when adding large new features like this :smile:

mikemahoney218 commented 1 year ago

Thank you so much @hfrick, this is exactly what I was hoping for :smile:

mikemahoney218 commented 1 year ago

I think I've addressed most of your comments! Let me know if I missed anything. Thank you so much for taking a look at this -- I was definitely bashing my head against the wall with this one, and I think some of that in-the-moment confusion is unfortunately reflected in the code :sweat_smile:
