mark-andrews / psyntur

Tools to help teach data analysis using R to NTU Psychology students

In `total_scores` add option to drop variables that were aggregated #38

Open mark-andrews opened 2 years ago

mark-andrews commented 2 years ago

I think this may be a relatively simple matter of deselecting by the selectors.
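As a rough illustration of what "deselecting by the selectors" could look like, here is a minimal sketch, assuming the selectors are captured with `rlang::enquos()` and negated inside `dplyr::select()`. The helper `drop_selected()` and the `toy` data frame are hypothetical and are not part of psyntur.

```r
library(dplyr)

# Hypothetical helper: drops whatever columns the selectors in ... match.
drop_selected <- function(.data, ...) {
  selectors <- rlang::enquos(...)
  for (selector in selectors) {
    # The parentheses keep `!` followed by `!!` from being read as `!!!`.
    .data <- select(.data, !(!!selector))
  }
  .data
}

# Made-up data: one demographic column and three item columns.
toy <- data.frame(
  age = c(19, 22),
  anxiety_1 = c(1, 3),
  anxiety_2 = c(2, 4),
  stress_1 = c(2, 2)
)

# Selectors may be tidyselect helpers or bare column names.
remaining <- drop_selected(toy,
                           anxiety = starts_with("anxiety_"),
                           stress = stress_1)
names(remaining)
```

Negating the captured selector directly means that any valid tidyselect expression can be dropped in the same way it was selected.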

Mark-Torrance commented 2 years ago

This?

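total_scores <- function(.data, ..., .method = 'mean', .append = FALSE, .drop = FALSE){

  # Row-wise aggregate of the selected columns, ignoring missing values.
  totalling_function <- function(.data_selection, .method){
    switch(.method,
           mean = rowMeans(.data_selection, na.rm = TRUE),
           sum = rowSums(.data_selection, na.rm = TRUE),
           # Mean of the non-missing items, rescaled by the number of items.
           sum_like = rowMeans(.data_selection, na.rm = TRUE) * ncol(.data_selection)
    )
  }

  if (!(.method %in% c('mean', 'sum', 'sum_like'))) {
    stop('The function that calculates the total must be either "mean", "sum" or "sum_like".')
  }

  # Capture each named selector, e.g. anxiety = starts_with('anxiety_').
  selection_sets <- rlang::enquos(...)

  # One column of totals per selector, named after the selector.
  results_df <- purrr::map_dfc(selection_sets,
                               function(selection_set){
                                 totalling_function(dplyr::select(.data, !!selection_set),
                                                    .method)
                               }
  )

  if (.append) {
    Df <- .data
    if (.drop) {
      # Deselect the aggregated columns, one selector at a time.
      # Negating the selector directly also works for bare column ranges.
      for (set in selection_sets){
        Df <- dplyr::select(Df, !(!!set))
      }
    }
    dplyr::bind_cols(Df, results_df)
  } else {
    results_df
  }
}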
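To make the difference between the three `.method` options concrete, here is a small worked sketch using only base R. The `items` data frame is made up for illustration; the three expressions mirror the ones inside `totalling_function` above.

```r
# Hypothetical responses from two participants; the second skipped item_2.
items <- data.frame(
  item_1 = c(1, 2),
  item_2 = c(2, NA),
  item_3 = c(3, 4)
)

# mean: average of the non-missing items (2 and 3).
row_mean <- rowMeans(items, na.rm = TRUE)

# sum: total of the non-missing items (6 and 6).
row_sum <- rowSums(items, na.rm = TRUE)

# sum_like: mean of the non-missing items times the number of items (6 and 9).
row_sum_like <- rowMeans(items, na.rm = TRUE) * ncol(items)
```

With complete data `sum` and `sum_like` agree. When an item is missing, `sum` treats it as zero, whereas `sum_like` in effect replaces it with the participant's mean on the remaining items.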

This...

library(tidyverse)  # provides read_csv(), %>% and starts_with()

Df_all <- read_csv("http://data.ntupsychology.net/psychometrics_demo_data.csv")

head(Df_all %>% total_scores(anxiety = starts_with('anxiety_'),
                             depression = starts_with('depression_'),
                             efficacy = starts_with('efficacy_'),
                             sociability = starts_with('sociability_'),
                             stress = starts_with('stress'),
                             .append = TRUE, .drop = TRUE))

gives...

gender  age  anxiety  depression  efficacy  sociability  stress
     1   19      1.8         1.8       2.8          2.2     1.0
     1   22      2.4         2.8       2.3          3.3     2.3
     2   20      1.7         2.6       3.0          3.5     1.9
     2   20      2.2         2.9       3.6          2.3     1.6
     2   19      1.7         2.6       3.3          2.8     2.0
     2   21      2.0         2.1       2.7          2.6     2.0
mark-andrews commented 2 years ago

As a final point before I close this: I think the default should be .append = TRUE, but I am not sure whether the default should also be .drop = TRUE. I am inclined towards .drop = TRUE by default, because total_scores is intended to be one step in a typical psychometric data analysis workflow, in which sets of items are aggregated and only the aggregated results are then used in the rest of the workflow. That would entail both .append = TRUE and .drop = TRUE.
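For comparison, the workflow described here can be sketched by hand with dplyr and base R, which is roughly what .append = TRUE together with .drop = TRUE would do in a single call. The `survey` data frame is made up for illustration, and this is only a sketch of the idea, not the psyntur implementation.

```r
library(dplyr)

# Hypothetical data: one demographic column and two anxiety items.
survey <- data.frame(
  age = c(19, 22),
  anxiety_1 = c(1, 3),
  anxiety_2 = c(2, 4)
)

# Step 1: aggregate the items and append the total as a new column.
survey$anxiety <- rowMeans(select(survey, starts_with("anxiety_")), na.rm = TRUE)

# Step 2: drop the items that were aggregated; "anxiety" itself does not
# match the "anxiety_" prefix, so the new total is kept.
scored <- select(survey, !starts_with("anxiety_"))
names(scored)
```

Only `age` and the aggregated `anxiety` column remain, which is the form the rest of a typical analysis would work with.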