FredHutch / VISCtemplates

Tools for writing reproducible reports at VISC

Update git_obj_compare_fun.R #111

Closed valduran18 closed 1 year ago

valduran18 commented 1 year ago

Referencing PR #110

There is a mix of camelCase and period.separated names, and I am under the impression that the team prefers snake_case to better align with tidyverse usage. Reference: https://style.tidyverse.org/syntax.html

The code has been updated to use snake_case in this PR.

How should we handle data object syntax? Can this function handle both "Lusso847_nab" and "nab" (auto-populating the package name)? Will we need to compare data objects that are not in a VISC data package?

The function is case-insensitive and can handle either form of the name (Lusso847_nab or nab), as long as a match can be found using grep().
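As an illustration only, the case-insensitive matching described above could look something like the sketch below. The helper name `find_data_object()` and the object list are hypothetical and are not taken from git_obj_compare_fun.R; the only assumption carried over from the PR is that matching relies on grep().

```r
# Hypothetical helper (not part of VISCtemplates): look up a data object
# by full or partial name, ignoring case.
find_data_object <- function(name, available) {
  # `name` is treated as a regular expression by grep()
  hits <- grep(name, available, ignore.case = TRUE, value = TRUE)
  if (length(hits) == 0) {
    stop("No data object matches '", name, "'")
  }
  hits
}

# Made-up object names for illustration
available <- c("Lusso847_nab", "Lusso847_bama")

find_data_object("nab", available)           # short name
find_data_object("lusso847_NAB", available)  # full name, different case
```

Because grep() does partial matching, a short name can match more than one object; a real implementation would probably need to decide how to handle multiple hits.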

For the stash activity, can we name the stash so that it does not interfere with an existing stash? Do we pop this stash so that we do not introduce a change to the user's system?

The stash is now saved as git_stash, and a stash_pop is applied afterwards to drop this most recent stash.

What is the benefit of using diffdf vs. dplyr::all_equal or testthat::expect_equal? Is this documented well?

diffdf allows for a detailed comparison between two data frames. One key difference between dplyr::all_equal() and diffdf::diffdf() is that diffdf returns the differences between the data, which may be useful when detailed information is needed. Would this be useful in a common scenario? Some options are:

  1. We output these differences
  2. We output only whether or not there is a difference (as with dplyr::all_equal() and testthat::expect_equal())
  3. We create an option that lets the user select either
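To make option 3 concrete, here is a minimal sketch of a user-selectable output. It is an assumption-laden illustration, not the PR's implementation: `compare_data()` and its `detailed` argument are hypothetical names, and base R's all.equal() is used as a stand-in for diffdf::diffdf() so the example runs without extra packages.

```r
# Hypothetical helper: `detailed` switches between a TRUE/FALSE answer
# (option 2) and a description of the differences (option 1).
# all.equal() is a stand-in here; the PR itself uses diffdf::diffdf().
compare_data <- function(old, new, detailed = FALSE) {
  result <- all.equal(old, new)
  same <- isTRUE(result)
  if (!detailed) {
    return(same)
  }
  list(
    equal = same,
    # all.equal() returns a character vector describing mismatches
    differences = if (same) character(0) else result
  )
}

# Made-up data for illustration
old <- data.frame(id = 1:3, titer = c(10, 20, 30))
new <- data.frame(id = 1:3, titer = c(10, 25, 30))

compare_data(old, new)                               # summary only
compare_data(old, new, detailed = TRUE)$differences  # description of changes
```

Defaulting to the short answer keeps routine checks quiet, while the detailed mode is there when a difference needs to be investigated.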

https://cloud.r-project.org/web/packages/diffdf/vignettes/diffdf-basic.html