spsanderson / healthyR

Hospital Data Analysis Workflow Tools
https://www.spsanderson.com/healthyR
Other
30 stars 3 forks source link

Suggested feature for diverging_lollipop_plt() [with code] #151

Open ibecav opened 1 year ago

ibecav commented 1 year ago

Is your feature request related to a problem? Please describe. Right now you are relying on the user to do something like convert to a z score so that the center point is zero

Describe the solution you'd like Would be nice if you calculated the mean so they could plot in raw units either as an option or by default

Describe alternatives you've considered How about some simple code like this...

mean_y <- data_tbl %>% summarise(mean({{ y_axis_var }})) %>% deframe()

and

ggplot2::geom_segment( ggplot2::aes(y = mean_y, x = {{ x_axis_var }}, yend = {{ y_axis_var }}, xend = {{ x_axis_var }}), color = "black") +

Additional context Add any other context or screenshots about the feature request here.

ibecav commented 1 year ago

Actually as long as I'm at it and eager to use it. A bunch of suggestions implemented in code for your consideration. I tried to emulate your style choices as much as possible...

diverging_lollipop_plt <-
  function(.data, 
           .x_axis, 
           .y_axis,
           .centrality_measure = "mean",
           .reverse_sort = FALSE,
           .bubble_fill_color = "black",
           .bubble_text_color = "white",
           .bubble_size = 6,
           .line_color = "black",
           .plot_title = NULL, 
           .plot_subtitle = NULL,
           .plot_caption = NULL, 
           .interactive = FALSE) {

    # * Tidyeval ----
    x_axis_var <- rlang::enquo(.x_axis)
    y_axis_var <- rlang::enquo(.y_axis)
    plot_title <- .plot_title
    plot_subtitle <- .plot_subtitle
    plot_caption <- .plot_caption
    interact_var <- .interactive

    # * Checks ----

    if (rlang::quo_is_missing(x_axis_var) | rlang::quo_is_missing(y_axis_var)) {
      stop(call. = FALSE, "You must provide both the .x_axis AND .y_axis columns.")
    }

    if (!is.data.frame(.data)) {
      stop(call. = FALSE, "(.data) is missing, please supply.")
    }

    if (!is.logical(.interactive)) {
      stop(call. = FALSE, "You must supply either TRUE or FALSE for .interactive")
    }

    # * Data ----
    data_tbl <- 
      tibble::as_tibble(.data) %>%
      dplyr::mutate({{ .x_axis }} := forcats::fct_reorder(.f = {{ .x_axis }}, .x = {{ .y_axis }}))

    if (.reverse_sort) {
      data_tbl <-
        data_tbl %>%
        dplyr::mutate({{ .x_axis }} := forcats::fct_rev(f = {{ .x_axis }}))
    }

    centrality <-
      data_tbl %>%
      dplyr::summarise(
        Median = median({{ .y_axis }}, na.rm = TRUE),
        Mean = mean({{ .y_axis }}, na.rm = TRUE)
      )

    if(.centrality_measure == "mean") {
      centrality_measure <- centrality$Mean
    } else {
      centrality_measure <- centrality$Median
    }

    # * Plot ----
    g <-
      ggplot2::ggplot(
        data = data_tbl,
        ggplot2::aes(
          x = {{ .x_axis }},
          y = {{ .y_axis }},
          label = {{ .y_axis }}
        )
      ) +
      ggplot2::geom_segment(
        ggplot2::aes(
          y = centrality_measure,
          x = {{ .x_axis }},
          yend = {{ .y_axis }},
          xend = {{ .x_axis }}
        ),
        color = .line_color
      ) +
      ggplot2::geom_point(
        stat = "identity",
        color = .bubble_fill_color,
        size = .bubble_size
      ) +
      ggplot2::geom_text(color = .bubble_text_color, size = 2) +
      ggplot2::labs(
        title    = plot_title,
        subtitle = plot_subtitle,
        caption  = plot_caption
      ) +
      ggplot2::coord_flip() +
      ggplot2::theme_minimal()

    # * Return ----
    if (interact_var) {
      plt <- plotly::ggplotly(g)
    } else {
      plt <- g
    }

    return(plt)
  }
spsanderson commented 1 year ago

I love it, I'll add a few more and credit you in the NEWS file for the issue, a little laid up today but this will be in the next release.

ibecav commented 1 year ago

No worries on credit but appreciated, Hope you feel better soon. I have a few more tweaks I'll drop later but the only thing I steadfastly refuse to do is update the doco, LOL I know how but it's just too tedious for me ;-)

spsanderson commented 1 year ago

No worries at all, thanks for the suggestion and feedback

On Mon, Feb 6, 2023 at 11:36 AM Chuck Powell @.***> wrote:

No worries on credit but appreciated, Hope you feel better soon. I have a few more tweaks I'll drop later but the only thing I steadfastly refuse to do is update the doco, LOL I know how but it's just too tedious for me ;-)

— Reply to this email directly, view it on GitHub https://github.com/spsanderson/healthyR/issues/151#issuecomment-1419380748, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAPCNS5GE3L7RXKA5HAWAH3WWESB7ANCNFSM6AAAAAAURGUW34 . You are receiving this because you were assigned.Message ID: @.***>

-- Steven P Sanderson II, MPH Book on Lulu http://goo.gl/lmrlFI Personal Site http://www.spsanderson.com

ibecav commented 1 year ago

All i have for now. If you don't want to add another package like checkmate to make argument checking easier please just comment it out...

diverging_lollipop_plt <-
  function(.data, 
           .x_axis, 
           .y_axis,
           .centrality_measure = "mean",
           .reverse_sort = FALSE,
           .bubble_fill_color = "black",
           .bubble_text_color = "white",
           .bubble_size = 6,
           .bubble_text_size = 2,
           .line_color = "black",
           .line_width = 1,
           .plot_title = NULL, 
           .plot_subtitle = NULL,
           .plot_caption = NULL,
           .x_label = NULL,
           .y_label = NULL,
           .interactive = FALSE) {

    # * Tidyeval ----
    x_axis_var <- rlang::enquo(.x_axis)
    y_axis_var <- rlang::enquo(.y_axis)

    plot_title <- .plot_title
    plot_subtitle <- .plot_subtitle
    plot_caption <- .plot_caption
    interact_var <- .interactive

    # * Checks ----
    checkmate::assert_numeric(x = c(.bubble_size,
                                    .bubble_text_size,
                                    .line_width))

    if (rlang::quo_is_missing(x_axis_var) | rlang::quo_is_missing(y_axis_var)) {
      stop(call. = FALSE, "You must provide both the .x_axis AND .y_axis columns.")
    }

    if (!is.data.frame(.data)) {
      stop(call. = FALSE, "(.data) is missing, please supply.")
    }

    if (!is.logical(.interactive)) {
      stop(call. = FALSE, "You must supply either TRUE or FALSE for .interactive")
    }

    # * Data ----
    data_tbl <- 
      tibble::as_tibble(.data) %>%
      dplyr::mutate({{ .x_axis }} := forcats::fct_reorder(.f = {{ .x_axis }}, .x = {{ .y_axis }}))

    if (.reverse_sort) {
      data_tbl <-
        data_tbl %>%
        dplyr::mutate({{ .x_axis }} := forcats::fct_rev(f = {{ .x_axis }}))
    }

    centrality <-
      data_tbl %>%
      dplyr::summarise(
        Median = median({{ .y_axis }}, na.rm = TRUE),
        Mean = mean({{ .y_axis }}, na.rm = TRUE)
      )

    if(.centrality_measure == "mean") {
      centrality_measure <- centrality$Mean
    } else {
      centrality_measure <- centrality$Median
    }

    if(is.null(.x_label)) {
      x_var_label <- rlang::as_label(.x_axis)
    } else {
      x_var_label <- .x_label
    }

    if(is.null(.y_label)) {
      y_var_label <- rlang::as_label(.y_axis)
    } else {
      y_var_label <- .y_label
    }

    # * Plot ----
    g <-
      ggplot2::ggplot(
        data = data_tbl,
        ggplot2::aes(
          x = {{ .x_axis }},
          y = {{ .y_axis }},
          label = {{ .y_axis }}
        )
      ) +
      ggplot2::geom_segment(
        ggplot2::aes(
          y = centrality_measure,
          x = {{ .x_axis }},
          yend = {{ .y_axis }},
          xend = {{ .x_axis }}
        ),
        color = .line_color, 
        linewidth = .line_width
      ) +
      ggplot2::geom_point(
        stat = "identity",
        color = .bubble_fill_color,
        size = .bubble_size
      ) +
      ggplot2::geom_text(color = .bubble_text_color, size = .bubble_text_size) +
      ggplot2::labs(
        title    = plot_title,
        subtitle = plot_subtitle,
        caption  = plot_caption,
        x        = x_var_label,
        y        = y_var_label
      ) +
      ggplot2::coord_flip() +
      ggplot2::theme_minimal()

    # * Return ----
    if (interact_var) {
      plt <- plotly::ggplotly(g)
    } else {
      plt <- g
    }

    return(plt)
  }
ibecav commented 1 year ago

My final suggestions for awhile (LOL you're probably happy for that. Back to my day job. I gave it a new name so you can potentially keep things safe... I also documented one hack I found for dual axes which might be helpful when you have this many rows as in mtcars...

# https://github.com/tidyverse/ggplot2/issues/3171
guide_axis_label_trans <- function(label_trans = identity, ...) {
  axis_guide <- guide_axis(...)
  axis_guide$label_trans <- rlang::as_function(label_trans)
  class(axis_guide) <- c("guide_axis_trans", class(axis_guide))
  axis_guide
}

guide_train.guide_axis_trans <- function(x, ...) {
  trained <- NextMethod()
  trained$key$.label <- x$label_trans(trained$key$.label)
  trained
}

make_lollipop_plot <-
  function(.data, 
           .x_axis, 
           .y_axis,
           .centrality_measure = "mean",
           .reverse_sort = FALSE,
           .bubble_fill_color = "black",
           .bubble_text_color = "white",
           .bubble_size = 6,
           .bubble_text_size = 2,
           .line_color = "black",
           .line_width = 1,
           .plot_title = NULL, 
           .plot_subtitle = NULL,
           .plot_caption = NULL,
           .x_label = NULL,
           .y_label = NULL,
           .dual_labels = FALSE,
           .interactive = FALSE) {

    # * Tidyeval ----
    x_axis_var <- rlang::enquo(.x_axis)
    y_axis_var <- rlang::enquo(.y_axis)

    plot_title <- .plot_title
    plot_subtitle <- .plot_subtitle
    plot_caption <- .plot_caption
    interact_var <- .interactive

    # * Checks ----
    checkmate::assert_numeric(x = c(.bubble_size,
                                    .bubble_text_size,
                                    .line_width))

    if (rlang::quo_is_missing(x_axis_var) | rlang::quo_is_missing(y_axis_var)) {
      stop(call. = FALSE, "You must provide both the .x_axis AND .y_axis columns.")
    }

    checkmate::assert_data_frame(x = .data,
                                 min.cols = 2)

    checkmate::assert_logical(c(.interactive, .reverse_sort, .dual_labels))

    checkmate::assert_character(c(.centrality_measure, .bubble_fill_color, 
                                .bubble_text_color, .line_color))

    # * Data ----
    data_tbl <- 
      tibble::as_tibble(.data) %>%
      dplyr::mutate({{ .x_axis }} := forcats::fct_reorder(.f = {{ .x_axis }}, .x = {{ .y_axis }}))

    if (.reverse_sort) {
      data_tbl <-
        data_tbl %>%
        dplyr::mutate({{ .x_axis }} := forcats::fct_rev(f = {{ .x_axis }}))
    }

    centrality <-
      data_tbl %>%
      dplyr::summarise(
        Median = median({{ .y_axis }}, na.rm = TRUE),
        Mean = mean({{ .y_axis }}, na.rm = TRUE)
      )

    if(.centrality_measure == "mean") {
      centrality_measure <- centrality$Mean
    } else {
      centrality_measure <- centrality$Median
    }

    if(is.null(.x_label)) {
      x_var_label <- rlang::as_label(x_axis_var)
    } else {
      x_var_label <- .x_label
    }

    if(is.null(.y_label)) {
      y_var_label <- rlang::as_label(y_axis_var)
    } else {
      y_var_label <- .y_label
    }

    # * Plot ----
    g <-
      ggplot2::ggplot(
        data = data_tbl,
        ggplot2::aes(
          x = {{ .x_axis }},
          y = {{ .y_axis }},
          label = {{ .y_axis }}
        )
      ) +
      ggplot2::geom_segment(
        ggplot2::aes(
          y = centrality_measure,
          x = {{ .x_axis }},
          yend = {{ .y_axis }},
          xend = {{ .x_axis }}
        ),
        color = .line_color, 
        linewidth = .line_width
      ) +
      ggplot2::geom_point(
        stat = "identity",
        color = .bubble_fill_color,
        size = .bubble_size
      ) +
      ggplot2::geom_text(color = .bubble_text_color, size = .bubble_text_size) +
      ggplot2::labs(
        title    = plot_title,
        subtitle = plot_subtitle,
        caption  = plot_caption,
        x        = x_var_label,
        y        = y_var_label
      ) +
      ggplot2::coord_flip() +
      ggplot2::expand_limits(x = c(0, nrow(data_tbl) + 1)) +
      ggplot2::theme_minimal() +
      ggplot2::theme(panel.grid.major.y = element_blank(), 
                     panel.grid.minor.y = element_blank())

    if(.dual_labels) {
      g <- g + guides(y.sec = guide_axis_label_trans(~ .x))
    }

    # * Return ----
    if (interact_var) {
      plt <- plotly::ggplotly(g)
    } else {
      plt <- g
    }

    return(plt)
  }
spsanderson commented 1 year ago

I am working on this, working through some bugs and trying to think of the best way to go about this, as a centraility measure on it's own is only good for a single line on the graph that would visualize whatis above/below it. This would still require finding the spot in that measure where each item exists.

ibecav commented 1 year ago

No rush on my account. I was actually inspired by your standardizing the variable to go with mean and median but trivial to allow the user to chose mean or median or a user supplied value if they have some known domain quantity or other measure they wish to impose.Sent from my mobile please forgive my brevity On Feb 16, 2023, at 11:29, Steven Paul Sanderson II, MPH @.***> wrote: I am working on this, working through some bugs and trying to think of the best way to go about this, as a centraility measure on it's own is only good for a single line on the graph that would visualize whatis above/below it. This would still require finding the spot in that measure where each item exists.

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>