IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

Be able to use the Prepare > Factor > Levels/Labels dialogue for multiple variables #7189

Open rdstern opened 2 years ago

rdstern commented 2 years ago

Labelling across many variables is a specific item in Bob's review (no. 33), and @lilyclements gives the code for this in her pull request #7127. What isn't clear is how to add this multiple-variable option to the dialogue.

This new version now has Select working, and a selection can be these multiple variables. So I propose that selections sometimes be included in the data selector. We may also want the selections to have properties we can use, so that here we only have factor variables. Could we check that the selection we include consists only of factors?

A check will have to be made anyway once we use these factors. I think what we are doing only makes sense if all the variables have the same number of levels?
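The two checks discussed above (factors only, and the same number of levels) could be sketched in base R roughly as follows. This is only an illustration: `check_factor_selection` is a hypothetical helper name, not an existing R-Instat function, and the example data frame is made up.

```r
# Hypothetical helper (not part of R-Instat): stops with an error unless
# every selected column is a factor with the same number of levels.
check_factor_selection <- function(data, cols) {
  missing_cols <- setdiff(cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }

  # Check 1: every column in the selection must be a factor
  is_fct <- vapply(data[cols], is.factor, logical(1))
  if (!all(is_fct)) {
    stop("Not factors: ", paste(cols[!is_fct], collapse = ", "))
  }

  # Check 2: every column must have the same number of levels
  n_levels <- vapply(data[cols], nlevels, integer(1))
  if (length(unique(n_levels)) > 1) {
    stop("Columns differ in number of levels: ",
         paste(cols, n_levels, sep = "=", collapse = ", "))
  }

  invisible(TRUE)
}

# Made-up example data: a and b have two levels, c has three, d is numeric
example_df <- data.frame(
  a = factor(c("0", "1", "0")),
  b = factor(c("1", "1", "0")),
  c = factor(c("x", "y", "z")),
  d = 1:3
)

check_factor_selection(example_df, c("a", "b"))  # passes silently
```

Calling it with `c("a", "c")` or `c("a", "d")` stops with an error instead. A stricter version could compare the level names themselves rather than just the counts.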

In this case the first variable will be used to show the grid, and then the new labels will be applied to all the variables.

We may make mistakes, so this will also be an excellent item for our forthcoming undo, as it changes the existing variables.

lilyclements commented 2 years ago

From PR #7127, @rdstern commented:

"[...] I discussed with @volloholic and he has nailed it! The solution follows the work @shadrackkibet is doing on Select. We don't need another control, or another option on the Data Options sub-dialogue. We already select subsets of variables and give them a name! In the data selector we also allow those names when appropriate. They are the groups of variables. Beautiful!"

My response: @rdstern - great! The solution in (d) sounds excellent, and I will look into implementing it in R-Instat!

As for how it will work, I assume we still use the across function, but applied to the selected group of columns. For example, if the function were fct_recode, it would be something like this in R:

library(tidyverse)

# Four example factor columns; x2 only contains the level "0"
x1 <- factor(rbinom(n = 6, size = 1, prob = 0.5))
x2 <- factor(rep(0, 6))
x3 <- factor(rbinom(n = 6, size = 1, prob = 0.5))
x4 <- factor(rbinom(n = 6, size = 1, prob = 0.5))
df <- data.frame(x1, x2, x3, x4)

# Names of the columns in the selected group
group_cols <- c("x1", "x2", "x3", "x4")

# Recode the levels of every column in the group
df %>%
  mutate(across(all_of(group_cols), ~ fct_recode(.x, yes = "1", no = "0")))

We would just need to use our own R functions in place of functions like fct_recode.
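As a rough sketch of that last step, the same across pattern works with any function that takes a factor and returns a factor. Here `relabel_by_position` is a hypothetical stand-in for whatever R-Instat function ends up doing the labelling, and the data are made up.

```r
library(dplyr)

# Hypothetical stand-in for an R-Instat labelling function:
# replaces the levels of a factor, in order, with new_labels.
relabel_by_position <- function(x, new_labels) {
  stopifnot(is.factor(x), length(new_labels) == nlevels(x))
  levels(x) <- new_labels
  x
}

# Made-up data: two factor columns in the group, one numeric column outside it
df <- data.frame(
  x1 = factor(c("0", "1", "1", "0")),
  x2 = factor(c("1", "1", "0", "0")),
  x3 = c(10, 20, 30, 40)
)

group_cols <- c("x1", "x2")

# Apply the custom function to every column in the selected group
relabelled <- df %>%
  mutate(across(all_of(group_cols), ~ relabel_by_position(.x, c("no", "yes"))))
```

Columns outside `group_cols` (here `x3`) are left unchanged. Because labels are matched by position, this sketch assumes every column in the group has its levels in the same order.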