This package aims to speed up research projects in social psychology (and related fields). To that end, it primarily includes functions that help lay the groundwork for analyses and functions that facilitate the reporting of results. Among other things, the package can help with creating scales, reporting descriptive statistics and correlations, and presenting regression models - as demonstrated below.
There are many packages that support data analysis and reporting. For instance, the psych package offers functions to create scales, while the modelsummary package offers options to create customisable tables in a wide variety of output formats. They power many of the functions offered here ‘under the hood.’ apa and papaja are two packages that directly support the reporting of results in APA style - they can complement this package well. However, none of the existing packages offered quite what we needed. This package also follows tidyverse conventions by supporting tidy evaluation and returning tibbles where possible.

You can install timesaveR from GitHub with the command below. If you do not have the remotes package installed, run install.packages("remotes") first.
remotes::install_github('lukaswallrich/timesaveR')
There are many functions in the package, and we will create vignettes detailing various use cases. However, the following can give you an initial flavor. The examples use data from the European Social Survey Wave 7 (2014). Here, I ignore survey weights. However, the package offers similar functions for analysing weighted survey data, which are explained in the survey data vignette.
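The weighted workflow is not shown here, but a minimal sketch might look as follows (assuming the srvyr package for the design object and a svy_cor_matrix() counterpart to the cor_matrix() function used below - the function name, its arguments and the appropriate weight variable should all be checked against that vignette):

library(srvyr)

# Declare the survey design (pspwght as the weight variable is an assumption
# based on the usual ESS naming - check the dataset documentation)
ess_survey <- as_survey_design(ess_health, weights = pspwght)

# Weighted correlation matrix, reported just like the unweighted version below
ess_survey %>% select(agea, health) %>%
  svy_cor_matrix() %>% report_cor_table()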
(I also load dplyr
since that is the recommended usage - of course,
there are base R alternatives for all of the steps.)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(timesaveR)
#> Note re timesaveR: Many functions in this package are alpha-versions - please treat results with care and report bugs and desired features.
Let’s create scales for health behaviours and depressive symptoms, each including some reverse coding.
scales <- list(
depression = c("fltdpr", "flteeff", "slprl", "wrhpp", "fltlnl",
"enjlf", "fltsd", "cldgng"),
healthy_eating = c("etfruit", "eatveg")
)
scales_reverse <- list(
depression = c("wrhpp", "enjlf"),
healthy_eating = c("etfruit", "eatveg")
)
scales <- make_scales(ess_health, items = scales, reversed = scales_reverse)
#> The following scales will be calculated with specified reverse coding: depression, healthy_eating
#Check descriptives, including reliability
scales$descriptives
#> # A tibble: 2 × 10
#> Scale n_items reliability reliability_method mean SD reversed rev_min
#> <chr> <int> <dbl> <chr> <dbl> <dbl> <chr> <dbl>
#> 1 depression 8 0.802 cronbachs_alpha 1.67 0.484 wrhpp e… 1
#> 2 healthy_e… 2 0.658 spearman_brown 4.97 1.11 etfruit… 1
#> # ℹ 2 more variables: rev_max <dbl>, text <chr>
#Add scale scores to dataset
ess_health <- bind_cols(ess_health, scales$scores)
Next, we are often interested in descriptive statistics, variable distributions and correlations.
ess_health %>% select(agea, health, depression, healthy_eating) %>%
cor_matrix() %>% report_cor_table()
| Variable | M (SD) | 1 | 2 | 3 |
|---|---|---|---|---|
| 1. agea | 50.61 (18.51) |  |  |  |
| 2. health | 2.26 (0.92) | .28 *** |  |  |
| 3. depression | 1.67 (0.48) | .03 * | .42 *** |  |
| 4. healthy_eating | 4.97 (1.11) | .17 *** | -.09 *** | -.13 *** |

M and SD are used to represent mean and standard deviation, respectively. Values in square brackets indicate the 95% confidence interval for each correlation. † p < .1, * p < .05, ** p < .01, *** p < .001
var_renames <- tibble::tribble(
~old, ~new,
"agea", "Age",
"health", "Poor health",
"depression", "Depression",
"healthy_eating", "Healthy eating",
)
ess_health %>% cor_matrix(var_names = var_renames) %>% report_cor_table()
| Variable | M (SD) | 1 | 2 | 3 |
|---|---|---|---|---|
| 1. Age | 50.61 (18.51) |  |  |  |
| 2. Poor health | 2.26 (0.92) | .28 *** |  |  |
| 3. Depression | 1.67 (0.48) | .03 * | .42 *** |  |
| 4. Healthy eating | 4.97 (1.11) | .17 *** | -.09 *** | -.13 *** |

M and SD are used to represent mean and standard deviation, respectively. Values in square brackets indicate the 95% confidence interval for each correlation. † p < .1, * p < .05, ** p < .01, *** p < .001
ess_health %>% cor_matrix(var_names = var_renames) %>% report_cor_table(add_distributions = TRUE, data = ess_health)
| Variable | M (SD) | Distributions | 1 | 2 | 3 |
|---|---|---|---|---|---|
| 1. Age | 50.61 (18.51) | (histogram) |  |  |  |
| 2. Poor health | 2.26 (0.92) | (histogram) | .28 *** |  |  |
| 3. Depression | 1.67 (0.48) | (histogram) | .03 * | .42 *** |  |
| 4. Healthy eating | 4.97 (1.11) | (histogram) | .17 *** | -.09 *** | -.13 *** |

M and SD are used to represent mean and standard deviation, respectively. Values in square brackets indicate the 95% confidence interval for each correlation. † p < .1, * p < .05, ** p < .01, *** p < .001
Often, we are also interested in how the means of an outcome variable differ between different groups. It can be fiddly to get these tables and the pairwise significance tests done, but this function does it in a breeze.
# Start with this in the console - that gets you 80% of the tribbles below.
# get_rename_tribbles(ess_health, gndr, cntry)
var_renames <- tribble(
~old, ~new,
"gndr", "Gender",
"cntry", "Country"
)
level_renames <- tribble(
~var, ~level_old, ~level_new,
"gndr", "1", "male",
"gndr", "2", "female",
"cntry", "DE", "Germany",
"cntry", "FR", "France",
"cntry", "GB", "UK"
)
report_cat_vars(ess_health, health, gndr, cntry, var_names = var_renames,
level_names = level_renames)
|  | N | Share | M (SD) |
|---|---|---|---|
| Gender |  |  |  |
| male | 3482 | 48.2% | 2.23 (0.90) a |
| female | 3744 | 51.8% | 2.30 (0.93) b |
| Country |  |  |  |
| Germany | 3045 | 42.1% | 2.34 (0.88) a |
| France | 1917 | 26.5% | 2.29 (0.89) a |
| UK | 2264 | 31.3% | 2.14 (0.97) b |

M and SD are used to represent mean and standard deviation for health for that group, respectively. Within each variable, the means of groups with different superscripts differ with p < .05 (p-values were adjusted using the Holm method).
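The package also helps with reporting regression models, showing raw and standardised coefficients side by side. The table below was generated with a call along the following lines - a minimal sketch, assuming the package's lm_std() and report_lm_with_std() helpers and inferring the model formula from the coefficients shown (check the function documentation for the exact arguments):

# Fit the model on the raw data and refit it with standardised variables
# (lm_std() is assumed to mirror lm() while standardising continuous variables)
mod <- lm(depression ~ agea + gndr + health + cntry, data = ess_health)
mod_std <- lm_std(depression ~ agea + gndr + health + cntry, data = ess_health)

# Report B (SE) and β in one table
report_lm_with_std(mod, mod_std)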
|  | B (SE) | β [95% CI] |
|---|---|---|
| (Intercept) | 1.20 (0.02)*** | -0.14 [-0.18, -0.10] |
| agea | -0.00 (0.00)*** | -0.10 [-0.12, -0.08] |
| gndr2 | 0.12 (0.01)*** | 0.24 [0.20, 0.28] |
| health | 0.23 (0.01)*** | 0.44 [0.42, 0.47] |
| cntryFR | -0.01 (0.01) | -0.02 [-0.08, 0.03] |
| cntryGB | 0.04 (0.01)** | 0.08 [0.03, 0.13] |
| N | 7171 |  |
| R² | .20 |  |
| F-tests | F(5, 7165) = 358.25, p < .001 |  |

Given that dummy variables lose their interpretability when standardized (Fox, 2015), β for dummy variables are semi-standardized, indicating the impact of that dummy on the standardized outcome variable. † p < .1, * p < .05, ** p < .01, *** p < .001
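Such a table is much easier to read with descriptive coefficient labels. A sketch of how the renamed version below might be produced, assuming a coef_renames argument analogous to the var_names argument used above (the argument name is an assumption - see the package help):

# Map raw coefficient names to readable labels
coef_renames <- tibble::tribble(
  ~old, ~new,
  "agea", "Age",
  "gndr2", "Gender (female)",
  "health", "Poor health",
  "cntryFR", "France (vs DE)",
  "cntryGB", "UK (vs DE)"
)

report_lm_with_std(mod, mod_std, coef_renames = coef_renames)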
|  | B (SE) | β [95% CI] |
|---|---|---|
| (Intercept) | 1.20 (0.02)*** | -0.14 [-0.18, -0.10] |
| Age | -0.00 (0.00)*** | -0.10 [-0.12, -0.08] |
| Gender (female) | 0.12 (0.01)*** | 0.24 [0.20, 0.28] |
| Poor health | 0.23 (0.01)*** | 0.44 [0.42, 0.47] |
| France (vs DE) | -0.01 (0.01) | -0.02 [-0.08, 0.03] |
| UK (vs DE) | 0.04 (0.01)** | 0.08 [0.03, 0.13] |
| N | 7171 |  |
| R² | .20 |  |
| F-tests | F(5, 7165) = 358.25, p < .001 |  |

Given that dummy variables lose their interpretability when standardized (Fox, 2015), β for dummy variables are semi-standardized, indicating the impact of that dummy on the standardized outcome variable. † p < .1, * p < .05, ** p < .01, *** p < .001
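Several models can also be reported side by side, including a test of the change in R² for nested models. A sketch for the comparison below, under the same assumptions as above (eduyrs as the education variable follows the usual ESS naming and is a guess; passing lists of models is likewise assumed):

# Model 2 adds education and an age x gender interaction
mod2 <- lm(depression ~ agea * gndr + health + cntry + eduyrs, data = ess_health)
mod2_std <- lm_std(depression ~ agea * gndr + health + cntry + eduyrs, data = ess_health)

# The renaming tribble would also need entries for the added terms
# (e.g. "eduyrs" and "agea:gndr2")
report_lm_with_std(list(mod, mod2), list(mod_std, mod2_std),
  coef_renames = coef_renames)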
|  | Model 1: B (SE) | Model 1: β [95% CI] | Model 2: B (SE) | Model 2: β [95% CI] |
|---|---|---|---|---|
| (Intercept) | 1.21 (0.02)*** | -0.14 [-0.18, -0.10] | 1.27 (0.02)*** | -0.14 [-0.18, -0.10] |
| Age | -0.00 (0.00)*** | -0.10 [-0.12, -0.08] | -0.00 (0.00)*** | -0.14 [-0.17, -0.10] |
| Gender (female) | 0.11 (0.01)*** | 0.24 [0.20, 0.28] | 0.03 (0.03) | 0.24 [0.20, 0.28] |
| Poor health | 0.23 (0.01)*** | 0.44 [0.42, 0.46] | 0.23 (0.01)*** | 0.44 [0.42, 0.46] |
| France (vs DE) | -0.01 (0.01) | -0.03 [-0.08, 0.02] | -0.01 (0.01) | -0.03 [-0.08, 0.02] |
| UK (vs DE) | 0.04 (0.01)** | 0.08 [0.03, 0.13] | 0.04 (0.01)** | 0.08 [0.03, 0.13] |
| Education |  |  | -0.00 (0.00)† | -0.02 [-0.04, 0.00] |
| Age x Female |  |  | 0.00 (0.00)** | 0.07 [0.02, 0.11] |
| N | 6852 |  | 6852 |  |
| R² | .20 |  | .20 |  |
| F-tests | F(5, 6846) = 337.06, p < .001 |  | F(7, 6844) = 243.10, p < .001 |  |
| Change | ΔR² = .00, F(2, 6844) = 6.78, p = .001 |  |  |  |

Given that dummy variables lose their interpretability when standardized (Fox, 2015), β for dummy variables are semi-standardized, indicating the impact of that dummy on the standardized outcome variable. † p < .1, * p < .05, ** p < .01, *** p < .001