ropensci / software-review

rOpenSci Software Peer Review.

waywiser: Ergonomic Methods for Assessing Spatial Models #571

Closed by mikemahoney218 1 year ago

mikemahoney218 commented 1 year ago

Date accepted: 2023-02-27
Submitting Author Name: Mike Mahoney
Submitting Author Github Handle: @mikemahoney218
Repository: https://github.com/mikemahoney218/waywiser
Version submitted:
Submission type: Stats
Badge grade: silver
Editor: @Paula-Moraga
Reviewers: @becarioprecario, @jakub_nowosad, @nowosad

Due date for @becarioprecario: 2023-02-04
Due date for @jakub_nowosad: 2023-02-06
Due date for @nowosad: 2023-02-06

Archive: TBD
Version accepted: TBD
Language: en

Type: Package
Package: waywiser
Title: Ergonomic Methods for Assessing Spatial Models
Version: 0.2.0.9000
Authors@R: c(
    person("Michael", "Mahoney", , "mike.mahoney.218@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0003-2402-304X")),
    person("Lucas", "Johnson", , "lucas.k.johnson03@gmail.com", role = c("ctb"),
           comment = c(ORCID = "0000-0002-7953-0260")),
    person("RStudio", role = c("cph", "fnd"))
  )
Description: Assessing predictive models of spatial data can be challenging, 
    both because these models are typically built for extrapolating outside the
    original region represented by training data and due to potential spatially
    structured errors, with "hot spots" of higher than expected error
    clustered geographically due to spatial structure in the underlying
    data. Methods are provided for assessing models fit to spatial data, 
    including approaches for measuring the spatial structure of model errors,
    assessing model predictions at multiple spatial scales, and evaluating where 
    predictions can be made safely. Methods are particularly useful for models 
    fit using the 'tidymodels' framework. Methods include Moran's I
    ('Moran' (1950) <doi:10.2307/2332142>), Geary's C 
    ('Geary' (1954) <doi:10.2307/2986645>), Getis-Ord's G
    ('Ord' and 'Getis' (1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>),
    agreement coefficients from 'Ji' and Gallo (2006)
    (<doi:10.14358/PERS.72.7.823>), agreement metrics from 'Willmott' (1981)
    (<doi:10.1080/02723646.1981.10642213>) and 'Willmott' 'et' 'al'. (2012)
    (<doi:10.1002/joc.2419>), an implementation of the area of applicability
    methodology from 'Meyer' and 'Pebesma' (2021) 
    (<doi:10.1111/2041-210X.13650>), and an implementation of
    multi-scale assessment as described in 'Riemann' 'et' 'al'. (2010)
    (<doi:10.1016/j.rse.2010.05.010>).
License: MIT + file LICENSE
URL: https://github.com/mikemahoney218/waywiser,
    https://mikemahoney218.github.io/waywiser/
BugReports: https://github.com/mikemahoney218/waywiser/issues
Depends: 
    R (>= 3.6)
Imports: 
    dplyr,
    fields,
    FNN,
    glue,
    hardhat,
    Matrix,
    purrr,
    rlang,
    rsample,
    sf (>= 1.0-0),
    spdep (>= 1.1-9),
    stats,
    tibble,
    tidyselect,
    yardstick
Suggests: 
    applicable,
    caret,
    CAST,
    covr,
    ggplot2,
    knitr,
    modeldata,
    recipes,
    rmarkdown,
    spatialsample,
    spelling,
    testthat (>= 3.0.0),
    tidymodels,
    tidyr,
    tigris,
    vip,
    whisker,
    withr
Config/testthat/edition: 3
Config/testthat/parallel: true
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE, roclets = c("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.2.3
Language: en-US
VignetteBuilder: knitr

Scope

Pre-submission Inquiry

General Information

Anyone fitting models to spatial data, particularly (but not exclusively) people working within the tidymodels ecosystem. This includes a number of domains, and we've already been using it in our modeling practice.

The waywiser R package makes it easier to measure the performance of models fit to 2D spatial data by implementing a number of well-established assessment methods in a consistent, ergonomic toolbox; features include new yardstick metrics for measuring agreement and spatial autocorrelation, functions to assess model predictions across multiple scales, and methods to calculate the area of applicability of a model.

Relevant software implementing similar algorithms includes CAST for ww_area_of_applicability(). Several of the yardstick metrics implemented here directly wrap spdep in a more consistent interface. Willmott's D is also implemented in hydroGOF. Other functions have (as far as I am aware) not been implemented elsewhere, such as ww_multi_scale(), which implements the procedure from Riemann et al. 2010, or ww_agreement_coefficient(), which implements metrics from Ji and Gallo 2006.

N/A

Badging

Silver

Have a demonstrated generality of usage beyond one single envisioned use case. Software is frequently developed for one particular use case envisioned by the authors themselves. Generalising the utility of software so that it is readily applicable to other use cases, and satisfactorily documenting such generality of usage, represents another aspect which may be considered sufficient for software to attain a silver grade.

This is the primary aspect which I believe merits the silver status. The waywiser package implements routines which are useful for a wide variety of spatial models and integrates well with the tidymodels ecosystem, making it (hopefully!) of interdisciplinary interest.

Depending on what the editors think, I'd also potentially submit this for gold, based upon the following two aspects:

Compliance with a good number of standards beyond those identified as minimally necessary. This will require reviewers and authors to agree on identification of both a minimal subset of necessary standards, and a full set of potentially applicable standards. This aspect may be considered fulfilled if at least one quarter of the additional potentially applicable standards have been met, and should definitely be considered fulfilled if more than one half have been met.

Internal aspects of package structure and design. Many aspects of the internal structure and design of software are too variable to be effectively addressed by standards. Packages which are judged by reviewers to reflect notably excellent design choices, especially in the implementation of core statistical algorithms, may also be considered worthy of a silver grade.

But I'm not familiar enough with the system to know if waywiser is likely to be in compliance with these two aspects, and am comfortable submitting for "silver" status if waywiser does not obviously meet both.

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

Code of conduct

mikemahoney218 commented 1 year ago

@Nowosad I think that would be in scope for waywiser. I'm not planning on implementing it in the near future (I need to focus on my dissertation :sweat_smile:, so all the techniques I'm actively adding are things I'm going to use myself), but I could definitely see the package growing in that direction over time.

ropensci-review-bot commented 1 year ago

:calendar: @jakub_nowosad you have 2 days left before the due date for your review (2023-02-06).

ropensci-review-bot commented 1 year ago

:calendar: @nowosad you have 2 days left before the due date for your review (2023-02-06).

becarioprecario commented 1 year ago

@mikemahoney218 thanks for taking into account my comments. I hope that they have been useful.

> I think these comments reflect that a few functions (namely, the p-value functions) are maybe a bit out of scope for the package, but I'm interested in what you (and others) think.

My opinion is that if a function is not going to be used by the user, it is best to keep it hidden. You can always use ::: if needed (for some reason…). So, the p-value functions are called from other functions in the package, but I do not think that they are meant to be used directly by the user.

> The p-value functions are included as "model assessment" tools because I've seen modeling projects use p-values to ID areas of concern with regard to autocorrelation: locations with more extreme p-values for local autocorrelation metrics were selected for further investigation, to see if model specifications could be improved. In that sense, p-values are included as an assessment metric for predictive modeling, and not so much for statistical testing purposes. As such, waywiser lets you return p-values without also returning test statistics themselves, as this approach doesn't really require looking at the underlying test statistic values; extreme p-values are areas of interest, no matter what their actual statistic is.

I see. But p-values are tied to the test statistics, so I am not sure it is a good idea to only report p-values.

> Given all this, I see two desirable ways to address this set of comments:
>
>   1. Remove p-value functions from the package. This removes a use-case for the package (looking for extreme p-values to identify areas which might help improve model specifications), but also makes it more clear that the package is designed for assessing predictive accuracy, and is not oriented towards inference.
>   2. Retain p-value functions, but add documentation clarifying that for inference users are recommended to use spdep equivalents directly.

I would go for (2) and compute statistics and associated p-values from a main function. Then the user can decide on what to plot in a map.

mikemahoney218 commented 1 year ago

@becarioprecario Thank you -- your comments have been extremely useful :smile:

My thinking (and experience) is that the p-value functions are called directly by users as a model diagnostic tool during the iteration process -- these p-values aren't being reported in a publication, but rather used to guide model development by highlighting hot-spots for model residuals (and hopefully helping to make a model misspecification clear, so it can be fixed before any publication).

There's not really a great way for functions using the yardstick infrastructure to return two different statistics (so here, the test statistic and p-value). The idiomatic way to do so is to use yardstick::metric_set() to combine two functions (here, the test statistic and p-value functions), but that's something that's best left to the user, as metric sets can't be "expanded" to include additional metrics.

For example, if you want to calculate global Moran's I with a p-value, plus an agreement coefficient, you can run metrics <- yardstick::metric_set(ww_global_moran_i, ww_global_moran_p_value, ww_agreement_coefficient) and then use the metrics() function with your data to get all three outputs. If waywiser provided a metric set (which is what functions like ww_global_moran() did, but note those functions were never in the submitted version of the package) then you couldn't call yardstick::metric_set(ww_global_moran, ww_agreement_coefficient); you'd get an error.
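A minimal sketch of the metric_set() behaviour described above, assuming only that yardstick is installed. It uses yardstick's built-in rmse, mae and rsq as stand-ins for the waywiser metrics, and the toy data frame and object names are made up for illustration; none of this is taken from the waywiser code base.

```r
library(yardstick)

# Toy data, invented for illustration only.
toy <- data.frame(
  truth    = c(1, 2, 3, 4, 5),
  estimate = c(1.1, 1.9, 3.2, 3.8, 5.3)
)

# Combine individual metric functions into one metric set.
# Built-in yardstick metrics stand in for the waywiser functions here.
metrics <- metric_set(rmse, mae, rsq)

# Calling the metric set returns one row per metric.
res <- metrics(toy, truth = truth, estimate = estimate)
print(res)

# A metric set is not itself a metric function, so passing one back into
# metric_set() alongside another metric raises an error.
pair <- metric_set(rmse, mae)
nested <- tryCatch(metric_set(pair, rsq), error = function(e) e)
print(inherits(nested, "error"))
```

This is why a package-provided "combined" metric set would not compose with further metrics, while single-statistic metric functions can be mixed freely by the user.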

That's why the "combined" functions were removed before this package was submitted; they don't work in a lot of places that users would expect them to be useful, and explaining the reason they work in a very different way than the rest of the package is pretty hard to communicate. Instead, all of the metrics provided by this package are pure yardstick metrics, without any of the weird edge cases. That means they're restricted to each returning a single type of statistic.

I've added documentation as described in (2) to these functions (see for instance https://github.com/mikemahoney218/waywiser/commit/333cf42cec3f6bddcd7f8b54150bc1f5dd8e365f#diff-45d2e91a37be2289564b4e1c987cbc8ac817ee874cc0ddcf19cfcdd8088c01feR6-R9 ). I could also add documentation about using yardstick::metric_set() to calculate both the test statistic and p-value at once, though I personally think it'd be better to not mention that; instead, the current documentation encourages people to use the spdep functions directly if they're looking to use p-values for other purposes than what I've described. This documentation (on using metric_set()) would probably look like the section on metric_set() that's in the Getting Started vignette.

Alternatively, I could remove the p-value functions, if you think that having them at all without a combination function is harmful. But I don't think it makes sense to add combination functions; they introduce too many weird edge cases and don't idiomatically fit into yardstick.

Paula-Moraga commented 1 year ago

Many thanks @becarioprecario and @Nowosad for your useful reviews, and @mikemahoney218 for taking into account all the comments and suggestions to improve the package.

I would like to ask @becarioprecario and @Nowosad whether you are happy with the new version of the package and it can be approved, or whether you have additional comments.

Nowosad commented 1 year ago

@Paula-Moraga I am happy with the current version of the package.

mikemahoney218 commented 1 year ago

Hi @becarioprecario @Paula-Moraga , I just wanted to bump this thread: are there additional comments still to be resolved? Thank you!

becarioprecario commented 1 year ago

@mikemahoney218 sorry for missing this. I am still concerned about not reporting together the test statistic and the p-value. Really, from a statistical point of view there is no point in reporting only one of them when making inference. You can have tiny p-values with really meaningless effects. I agree that not shooting yourself in the foot is most likely on the users’ side but still…

In any case, these two values can be extracted with the functions in the package, which is fine, I think. If you consider that this discussion is helpful you can include it somewhere in the documentation and/or vignettes.

mikemahoney218 commented 1 year ago

@becarioprecario Not a problem -- thank you for taking the time to review the package! It's highly appreciated. I'll add a new section to the vignettes tomorrow and follow up here with the commit.

mikemahoney218 commented 1 year ago

@becarioprecario I added a summary of this discussion to the "residual autocorrelation" vignette: https://github.com/mikemahoney218/waywiser/commit/c0860a1b974b9b0e2b5d55b990657d8a7706fb28

Thank you again for your time reviewing this package, it's been a real help.

becarioprecario commented 1 year ago

@mikemahoney218 many thanks for adding that note. I think that I have no more comments about the package. Thanks for contributing your package to the community!!

Paula-Moraga commented 1 year ago

Many thanks @mikemahoney218 @becarioprecario @Nowosad for your time and work to improve the package. I am very pleased to approve it!

Paula-Moraga commented 1 year ago

@ropensci-review-bot approve waywiser

ropensci-review-bot commented 1 year ago

Approved! Thanks @mikemahoney218 for submitting and @becarioprecario, @jakub_nowosad, @nowosad for your reviews! :grin:

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent).

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @ropensci/blog-editors in your reply. They will get in touch about timing and can answer any questions.

We maintain an online book with our best practice and tips; this chapter starts the 3rd section, which is about guidance for after onboarding (with advice on releases, package marketing, GitHub grooming). The guide also features CRAN gotchas. Please tell us what could be improved.

Last but not least, you can volunteer as a reviewer via filling a short form.

mikemahoney218 commented 1 year ago

@ropensci-review-bot invite me to ropensci/waywiser

ropensci-review-bot commented 1 year ago

Invitation sent!

mikemahoney218 commented 1 year ago

@ropensci-review-bot finalize transfer of waywiser

ropensci-review-bot commented 1 year ago

Transfer completed. The waywiser team is now the owner of the repository and the author has been invited to the team.