In `total_scores` add option to drop variables that were aggregated

mark-andrews commented 2 years ago

I think this may be a relatively simple matter of deselecting by the selectors.

Mark-Torrance commented 2 years ago

This?

library(dplyr)

total_scores <- function(.data, ..., .method = 'mean', .append = FALSE, .drop = FALSE){

  # Row-wise aggregate of the selected columns, ignoring missing values
  totalling_function <- function(.data_selection, .method){
    switch(.method,
           mean = rowMeans(.data_selection, na.rm = TRUE),
           sum = rowSums(.data_selection, na.rm = TRUE),
           # Row mean rescaled by the number of items
           sum_like = rowMeans(.data_selection, na.rm = TRUE) * ncol(.data_selection)
    )
  }

  if (!(.method %in% c('mean', 'sum', 'sum_like'))) {
    stop('The function that calculates the total must be either "mean", "sum" or "sum_like".')
  }

  # Capture each named selector, e.g. anxiety = starts_with('anxiety_')
  selection_sets <- rlang::enquos(...)

  # One total-score column per selection set, named after its argument
  results_df <- purrr::map_dfc(selection_sets,
                               function(selection_set){
                                 totalling_function(select(.data, !!selection_set),
                                                    .method)
                               }
  )

  if (.append) {
    Df <- .data
    if (.drop) {
      # Remove the item columns that went into each total
      for (set in selection_sets) {
        Df <- select(Df, !(!!set))
      }
    }
    bind_cols(Df, results_df)
  } else {
    results_df
  }
}

This...

Df_all <- read_csv("http://data.ntupsychology.net/psychometrics_demo_data.csv")

head(Df_all %>% total_scores(anxiety = starts_with('anxiety_'),
                             depression = starts_with('depression_'),
                             efficacy = starts_with('efficacy_'),
                             sociability = starts_with('sociability_'),
                             stress = starts_with('stress'),
                             .append = TRUE, .drop = TRUE))

gives...

gender	age	anxiety	depression	efficacy	sociability	stress
1	19	1.8	1.8	2.8	2.2	1.0
1	22	2.4	2.8	2.3	3.3	2.3
2	20	1.7	2.6	3.0	3.5	1.9
2	20	2.2	2.9	3.6	2.3	1.6
2	19	1.7	2.6	3.3	2.8	2.0
2	21	2.0	2.1	2.7	2.6	2.0

mark-andrews commented 2 years ago

I implemented a variant of this.
Having consulted Holy Writ, where it is written "Thou shalt not use for loops any more in R", I used purrr::reduce to replace the for loop that drops the selected sets of variables.
Added tests, as one should.
Updated documentation.
Pushed to the development branch. The main branch is kept stable and in line with the version on CRAN. Everything on development will eventually be merged or rebased into main, but there is no great rush; merging will happen at the next CRAN update.
See help pages with live example (last example) here: https://mark-andrews.github.io/psyntur/reference/total_scores.html
See code here: https://github.com/mark-andrews/psyntur/blob/2e8da4849d3ac2b1d4b91a5f6aa1b1009360f031/R/psychometrics.R#L76-L113
See commit here: https://github.com/mark-andrews/psyntur/commit/2e8da4849d3ac2b1d4b91a5f6aa1b1009360f031
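
For readers who do not want to follow the links, here is a minimal sketch of how a for loop over selection sets can be replaced by purrr::reduce. It is not the code in the linked commit: the helper name drop_selected and the toy data frame are hypothetical and used only for illustration, and it assumes dplyr, purrr and rlang are installed.

```r
library(dplyr)
library(purrr)
library(rlang)

# Hypothetical helper (not the psyntur implementation): drop every column
# matched by any of the captured selection sets.
drop_selected <- function(.data, ...) {
  selection_sets <- enquos(...)
  # Start from the full data frame and remove one selection set per step
  reduce(
    selection_sets,
    function(df, set) select(df, !(!!set)),
    .init = .data
  )
}

# Toy data, made up for this example
toy <- tibble(
  id = 1:3,
  anxiety_1 = c(1, 2, 3),
  anxiety_2 = c(2, 3, 4),
  stress_1 = c(5, 4, 3)
)

remaining <- drop_selected(toy,
                           anxiety = starts_with("anxiety_"),
                           stress = starts_with("stress_"))
names(remaining)
```

Passing the data frame as .init means the accumulator is always a data frame, so each step simply deselects one more set of columns from the result of the previous step.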

mark-andrews commented 2 years ago

As a final point before I close this: I think the default should be .append = TRUE, but I am less sure whether the default should also be .drop = TRUE. I am inclined towards .drop = TRUE by default, because total_scores is intended as one step in a typical psychometrics-style data analysis workflow, in which sets of items are aggregated and then, typically, only the aggregated scores are used in the rest of the workflow. That workflow implies both .append = TRUE and .drop = TRUE.

mark-andrews / psyntur

In `total_scores` add option to drop variables that were aggregated #38