New argument `data_fun` to `gglikert()`

sda030 commented 6 months ago

This is not a huge priority for me or my colleagues, but one might want to sort not by the proportion of categories above (or including) the middle, but simply by the proportion of the top category. Maybe something for 0.7 or 0.8? :)

larmarange commented 6 months ago

I'm not sure I see the use case.

In any case, gglikert() cannot cover every situation. For advanced usage, position_likert() can be used directly.

Advanced users can call gglikert_data() to generate a dataset, or build an appropriate dataset themselves, order the factors as needed (e.g. with forcats::fct_reorder()), and then produce a plot tailored to their specific needs.

library(ggstats)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
  ) %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

tmp <- df %>% gglikert_data(include = q1:q6)

# custom sorting of the data goes here (i.e. reorder the levels of the factors to be plotted)

ggplot(tmp) +
  aes(y = .question, fill = .answer, label = scales::percent(after_stat(prop), accuracy = 1)) +
  geom_bar(position = "likert", stat = "prop") +
  geom_text(position = position_likert(.5), stat = "prop")

Created on 2024-03-29 with reprex v2.1.0
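The reprex above leaves the custom sorting step as a placeholder comment. As a minimal sketch of what that step could look like for the request in this issue (sorting by the top category only), the questions can be reordered by their share of "Strongly agree" answers before plotting. The names `tmp_sorted` and `top_level` are illustrative, the data is a reduced version of the reprex without missing values, and the sketch assumes the dataset returned by gglikert_data() has the `.question` and `.answer` columns used in the plot above.

```r
library(ggstats)
library(dplyr)
library(ggplot2)

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5)
  ) %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

tmp <- df %>% gglikert_data(include = q1:q3)

# Illustrative choice: the top category is the last Likert level
top_level <- "Strongly agree"

# Reorder the questions by their share of top-category answers
# (the mean of a logical vector is the proportion of TRUE values)
tmp_sorted <- tmp %>%
  mutate(
    .question = forcats::fct_reorder(
      .question,
      .answer == top_level,
      .fun = mean
    )
  )

# Same plotting approach as in the reprex, applied to the reordered data
p <- ggplot(tmp_sorted) +
  aes(y = .question, fill = .answer) +
  geom_bar(position = "likert", stat = "prop")
```

If the data contains missing answers (as q5 does in the reprex), the comparison `.answer == top_level` yields NA values, so they would need to be handled explicitly before reordering.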

larmarange commented 6 months ago

We could possibly add an argument that accepts a custom function to be applied to the data before plotting.

sda030 commented 6 months ago

Hi, yes, I see your point. I like the idea of a custom function in the future; otherwise, I can write local wrapper functions if needed.

larmarange commented 6 months ago

Maybe the easiest option would be to add an argument process_data = TRUE to gglikert(). If TRUE, the data would be processed with gglikert_data() (as is currently the case); if FALSE, the data would be used as is, allowing a custom dataset to be provided.

sda030 commented 6 months ago

Sounds good, off the top of my head. Let me know if you need me to test a solution.

larmarange commented 6 months ago

In the end, I went with a data_fun argument. Please see #61.
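For readers landing here, a rough sketch of how the `data_fun` argument might be used for the original request. It assumes a ggstats version that includes `data_fun`, and that the function receives the long-format data produced by gglikert_data() (with `.question` and `.answer` columns) and returns the modified data frame; check the package documentation for the exact contract. The helper name `sort_by_top_category` is hypothetical, and the data is a reduced version of the reprex without missing values.

```r
library(ggstats)
library(dplyr)

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5)
  ) %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

# Hypothetical helper: reorder questions by their share of "Strongly agree"
# (assumes the data passed in has .question and .answer columns)
sort_by_top_category <- function(d) {
  d$.question <- forcats::fct_reorder(
    d$.question,
    d$.answer == "Strongly agree",
    .fun = mean
  )
  d
}

# Pass the helper to gglikert() so it is applied to the prepared data
p <- gglikert(df, include = q1:q3, data_fun = sort_by_top_category)
```

Keeping the sorting logic in a small function like this means the rest of the gglikert() defaults (labels, colours, layout) stay untouched.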

larmarange / ggstats

New argument `data_fun` to `gglikert()` #60