Open schroeder-matt opened 2 months ago
Well done with the function! Excellent use of `::` and parameter documentation. I'm pasting it here for quicker reference.
Do you think `across()` could work for your needs? It is a newer feature in the tidyverse. I've used it in some code here.
If `across()` doesn't work, can you help me understand how this function works differently?
#' @title Generate code to sum across multiple cells
#'
#' @param .table character, prefix for table
#' @param .start first cell in range to be summed
#' @param .end last cell in range to be summed
#' @param ... not used; arguments listed after it must be passed by name
#' @param .int numeric, interval between cells. Default is `1`
#' @param .width numeric, pad cell numbers with 0s to this length. Default is `1`.
#' @param repeatTimes numeric, number of times to repeat the sequence. Default
#' is `1`.
#' @param repeatOffset numeric, jump by this number of cells each time the
#' sequence repeats. Default is `0`.
#'
#' @return an expression (built with `rlang::parse_expr()`) that adds up the generated cell names
#' @export
#'
#' @examples
#' #### sumRange("B01234e", 2, 6) --> B01234e2 + B01234e3 + B01234e4 + B01234e5 + B01234e6
#' #### sumRange("B01234e", 2, 6, .int=2) --> B01234e2 + B01234e4 + B01234e6
#' #### sumRange("B01234e", 2, 6, .int=2, .width=3) --> B01234e002 + B01234e004 + B01234e006
#' #### sumRange("B01234e", 2, 6, .int=2, .width=3, repeatTimes=2, repeatOffset=10) -->
#' #### B01234e002 + B01234e004 + B01234e006 + B01234e012 + B01234e014 + B01234e016
#' #### can be used in dplyr::mutate(!!sumRange())
sumRange <- function(.table,
                     .start,
                     .end,
                     ...,
                     .int = 1,
                     .width = 1,
                     repeatTimes = 1,
                     repeatOffset = 0) {
  if (repeatTimes == 1) { # if no repetition is needed, just use simple code
    a <- paste0(rep(.table),
                stringr::str_pad(seq(from = .start,
                                     to = .end,
                                     by = .int), width = .width, pad = "0"),
                collapse = " + ") # and this links each table/cell number combination with a "+"
    rlang::parse_expr(a)
  } else { # otherwise, repeat as many times as requested in argument, putting elements into a list
    a <- purrr::map(1:repeatTimes, ~ paste0(rep(.table),
                                            stringr::str_pad(seq(from = .start + (repeatOffset * (.x - 1)),
                                                                 to = .end + (repeatOffset * (.x - 1)),
                                                                 by = .int), width = .width, pad = "0"),
                                            collapse = " + ") # and this links each table/cell number combination with a "+"
    )
    # then we just have to combine the list elements (one per repetition)
    rlang::parse_expr(paste0(a, collapse = " + "))
  }
}
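To make the repeat/offset logic easier to check by eye, here is a rough base-R sketch that only builds the cell names (it does not parse them into an expression). `cell_names()` and its argument names are hypothetical and not part of `councilR`; it is meant as an illustration of the naming scheme, not a replacement for `sumRange()`.

```r
# Hypothetical helper for illustration only (not part of councilR).
# Returns the generated cell names as a character vector.
cell_names <- function(table, start, end, int = 1, width = 1,
                       repeat_times = 1, repeat_offset = 0) {
  # One offset per repetition: 0, repeat_offset, 2 * repeat_offset, ...
  offsets <- repeat_offset * (seq_len(repeat_times) - 1)
  # Shift the base sequence by each offset and flatten into one vector
  cells <- unlist(lapply(offsets, function(off) {
    seq(from = start + off, to = end + off, by = int)
  }))
  # Left-pad the cell numbers with zeros up to `width` characters
  paste0(table, formatC(cells, width = width, flag = "0", format = "d"))
}

# Mirrors the last documented sumRange() example
sum_string <- paste(
  cell_names("B01234e", 2, 6, int = 2, width = 3,
             repeat_times = 2, repeat_offset = 10),
  collapse = " + "
)
print(sum_string)
```

With the arguments above, the second repetition covers cells 12 through 16 because the whole range is shifted by the offset of 10.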
Sure! `across()` is a simpler way to transform/create multiple variables at once; this function is a simpler way to add up multiple variables at once (in order to transform/create a single variable). An example is here.
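A small sketch of that difference, assuming `dplyr` and `rlang` are installed. The data frame and the column names `v1` to `v3` are made up, and the expression is written out by hand here to stand in for what `sumRange()` would return.

```r
library(dplyr)

# Made-up data for illustration
df <- tibble(v1 = 1:3, v2 = 4:6, v3 = 7:9)

# across(): one transformation applied to several columns, giving several columns back
doubled <- mutate(df, across(v1:v3, ~ .x * 2))

# sumRange()-style: one parsed expression injected with !! to create a single column
total_expr <- rlang::parse_expr("v1 + v2 + v3")
summed <- mutate(df, total = !!total_expr)

print(doubled)
print(summed)
```

In the first call the three input columns are each overwritten; in the second the inputs are untouched and a single new `total` column is added.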
One of the few things that's easier to do in SAS than in R is summing across ranges of variables by number. In SAS, `sum(of v1-v4)` is interpreted as `v1 + v2 + v3 + v4`, but R has nothing similar that I'm aware of. This is unfortunate because so many Census Bureau datasets have this naming scheme, and many times I need to add up several variables at once.
I created the attached function (building on the framework developed by former Research staff member Nicole Sullivan), but I store it in multiple repositories. I figured it would be good to have this in `councilR` so that it lives in only one place and can potentially help others.
sumRange.txt
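For comparison, one base-R way to get close to the SAS behavior, assuming the columns share a prefix and a numeric suffix, is to build the column names and pass them to `rowSums()`. The data frame below is made up for illustration, and this is only a sketch of the idea rather than what `sumRange()` does internally.

```r
# Made-up data with numbered columns plus one unrelated column
df <- data.frame(v1 = 1:2, v2 = 3:4, v3 = 5:6, v4 = 7:8, other = c(100, 200))

# Build the names v1..v4, roughly what SAS expands `of v1-v4` to
vars <- paste0("v", 1:4)

# Row-wise total over just those columns; `other` is ignored
df$total <- rowSums(df[vars])

print(df)
```

Unlike the expression-based approach, this returns the totals directly, so it cannot be spliced into `dplyr::mutate()` with `!!`.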
Is this something that would be a good fit for `councilR`? If so, please feel free to make edits to anything and everything, because I am not an expert in designing functions. I also made this with Census Bureau data in mind, so you may find opportunities to generalize the code for other uses.